From de90d6f2181f8de55613833ba20c0784d5698a30 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 5 Aug 2019 06:15:25 +0530 Subject: [PATCH 01/14] Delete Section 1 - Differential Privacy.ipynb --- Section 1 - Differential Privacy.ipynb | 1194 ------------------------ 1 file changed, 1194 deletions(-) delete mode 100644 Section 1 - Differential Privacy.ipynb diff --git a/Section 1 - Differential Privacy.ipynb b/Section 1 - Differential Privacy.ipynb deleted file mode 100644 index 54c6558..0000000 --- a/Section 1 - Differential Privacy.ipynb +++ /dev/null @@ -1,1194 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Lesson: Toy Differential Privacy - Simple Database Queries" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this section we're going to play around with Differential Privacy in the context of a database query. The database is going to be a VERY simple database with only one boolean column. Each row corresponds to a person. Each value corresponds to whether or not that person has a certain private attribute (such as whether they have a certain disease, or whether they are above/below a certain age). We are then going to learn how to know whether a database query over such a small database is differentially private or not - and more importantly - what techniques are at our disposal to ensure various levels of privacy\n", - "\n", - "\n", - "### First We Create a Simple Database\n", - "\n", - "Step one is to create our database - we're going to do this by initializing a random list of 1s and 0s (which are the entries in our database). Note - the number of entries directly corresponds to the number of people in our database." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 0, 0, ..., 1, 1, 1], dtype=torch.uint8)" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import torch\n", - "\n", - "# the number of entries in our database\n", - "num_entries = 5000\n", - "\n", - "db = torch.rand(num_entries) > 0.5\n", - "db" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Project: Generate Parallel Databases\n", - "\n", - "Key to the definition of differential privacy is the ability to ask the question \"When querying a database, if I removed someone from the database, would the output of the query be any different?\". Thus, in order to check this, we must construct what we term \"parallel databases\" which are simply databases with one entry removed. \n", - "\n", - "In this first project, I want you to create a list of every parallel database to the one currently contained in the \"db\" variable. Then, I want you to create a function which both:\n", - "\n", - "- creates the initial database (db)\n", - "- creates all parallel databases" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try project here!"
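One possible solution sketch for this project (the helper names and the slicing-based removal are one choice among many; `create_db_and_parallels` matches the name the later cells assume):

```python
import torch

def get_parallel_db(db, remove_index):
    # everything before the removed entry, concatenated with everything after it
    return torch.cat((db[:remove_index], db[remove_index + 1:]))

def get_parallel_dbs(db):
    # one parallel database per person in db
    return [get_parallel_db(db, i) for i in range(len(db))]

def create_db_and_parallels(num_entries):
    db = torch.rand(num_entries) > 0.5
    return db, get_parallel_dbs(db)

db, pdbs = create_db_and_parallels(20)
print(len(pdbs), len(pdbs[0]))  # 20 parallel dbs, each one entry shorter
```

Each parallel database is identical to `db` except that one person's row is missing.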
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Towards Evaluating The Differential Privacy of a Function\n", - "\n", - "Intuitively, we want to be able to query our database and evaluate whether or not the result of the query is leaking \"private\" information. As mentioned previously, this is about evaluating whether the output of a query changes when we remove someone from the database. Specifically, we want to evaluate the *maximum* amount the query changes when someone is removed (maximum over all possible people who could be removed). So, in order to evaluate how much privacy is leaked, we're going to iterate over each person in the database and measure the difference in the output of the query relative to when we query the entire database. \n", - "\n", - "Just for the sake of argument, let's make our first \"database query\" a simple sum. Aka, we're going to count the number of 1s in the database." 
- ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "db, pdbs = create_db_and_parallels(5000)" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "def query(db):\n", - " return db.sum()" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "full_db_result = query(db)" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "sensitivity = 0\n", - "for pdb in pdbs:\n", - " pdb_result = query(pdb)\n", - " \n", - " db_distance = torch.abs(pdb_result - full_db_result)\n", - " \n", - " if(db_distance > sensitivity):\n", - " sensitivity = db_distance" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor(1)" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sensitivity" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project - Evaluating the Privacy of a Function\n", - "\n", - "In the last section, we measured the difference between each parallel db's query result and the query result for the entire database and then calculated the max value (which was 1). This value is called \"sensitivity\", and it corresponds to the function we chose for the query. Namely, the \"sum\" query will always have a sensitivity of exactly 1. However, we can also calculate sensitivity for other functions as well.\n", - "\n", - "Let's try to calculate sensitivity for the \"mean\" function." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" 
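A sketch of the mean-sensitivity experiment (it assumes a `get_parallel_db` helper like the one from the parallel-databases project; the exact value depends on the random database):

```python
import torch

def get_parallel_db(db, remove_index):
    return torch.cat((db[:remove_index], db[remove_index + 1:]))

def sensitivity(query, n_entries=1000):
    # empirically measure the max change in the query when one person is removed
    db = torch.rand(n_entries) > 0.5
    full_db_result = query(db)
    max_distance = torch.tensor(0.)
    for i in range(n_entries):
        pdb_result = query(get_parallel_db(db, i))
        distance = torch.abs(pdb_result - full_db_result)
        if distance > max_distance:
            max_distance = distance
    return max_distance

def mean_query(db):
    return db.float().mean()

mean_sens = sensitivity(mean_query)
print(mean_sens)  # on the order of 1/1000, far below the sum's sensitivity of 1
```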
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Wow! That sensitivity is WAY lower. Note the intuition here. \"Sensitivity\" is measuring how sensitive the output of the query is to a person being removed from the database. For a simple sum, this is always 1, but for the mean, removing a person is going to change the result of the query by roughly 1 divided by the size of the database (which is much smaller). Thus, \"mean\" is a VASTLY less \"sensitive\" function (query) than SUM." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Calculate L1 Sensitivity For Threshold\n", - "\n", - "In this first project, I want you to calculate the sensitivity for the \"threshold\" function. \n", - "\n", - "- First compute the sum over the database (i.e. sum(db)) and return whether that sum is greater than a certain threshold.\n", - "- Then, I want you to create databases of size 10 and threshold of 5 and calculate the sensitivity of the function. \n", - "- Finally, re-initialize the database 10 times and calculate the sensitivity each time."
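The steps above can be sketched as follows (one possible implementation; the inlined parallel-database slicing is the same trick used earlier):

```python
import torch

def query(db, threshold=5):
    # 1 if the count of 1s exceeds the threshold, else 0
    return (db.sum() > threshold).float()

sensitivities = []
for experiment in range(10):          # re-initialize the database 10 times
    db = torch.rand(10) > 0.5         # database of size 10
    full_db_result = query(db)
    sens = torch.tensor(0.)
    for i in range(len(db)):
        pdb = torch.cat((db[:i], db[i + 1:]))
        distance = torch.abs(query(pdb) - full_db_result)
        if distance > sens:
            sens = distance
    sensitivities.append(float(sens))

print(sensitivities)  # mostly 0.0, with the occasional 1.0
```

The sensitivity is 1 only when the database's sum sits right at the threshold boundary, which is why most runs report 0.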
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: A Basic Differencing Attack\n", - "\n", - "Sadly none of the functions we've looked at so far are differentially private (despite them having varying levels of sensitivity). The most basic type of attack can be done as follows.\n", - "\n", - "Let's say we wanted to figure out a specific person's value in the database. All we would have to do is query for the sum of the entire database and then the sum of the entire database without that person!\n", - "\n", - "# Project: Perform a Differencing Attack on Row 10\n", - "\n", - "In this project, I want you to construct a database and then demonstrate how you can use two different sum queries to expose the value of the person represented by row 10 in the database (note, you'll need to use a database with at least 10 rows)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!"
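A minimal sketch of the attack:

```python
import torch

db = torch.rand(100) > 0.5

# sum over the whole database, and over the parallel database with row 10 removed
full_sum = db.float().sum()
pdb_sum = torch.cat((db[:10], db[11:])).float().sum()

# the difference between the two sums is exactly row 10's private value
recovered_value = full_sum - pdb_sum
print(float(recovered_value) == float(db[10]))  # True
```

The same trick works with a mean query (compare `n * mean` before and after removal) or a threshold query near the boundary, which is why none of these deterministic queries are private.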
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Local Differential Privacy\n", - "\n", - "As you can see, the basic sum query is not differentially private at all! In truth, differential privacy always requires a form of randomness added to the query. Let me show you what I mean.\n", - "\n", - "### Randomized Response (Local Differential Privacy)\n", - "\n", - "Let's say I have a group of people I wish to survey about a very taboo behavior which I think they will lie about (say, I want to know if they have ever committed a certain kind of crime). I'm not a policeman, I'm just trying to collect statistics to understand the higher level trend in society. So, how do we do this? One technique is to add randomness to each person's response by giving each person the following instructions (assuming I'm asking a simple yes/no question):\n", - "\n", - "- Flip a coin 2 times.\n", - "- If the first coin flip is heads, answer honestly\n", - "- If the first coin flip is tails, answer according to the second coin flip (heads for yes, tails for no)!\n", - "\n", - "Thus, each person is now protected with \"plausible deniability\". 
If they answer \"Yes\" to the question \"have you committed X crime?\", then it might be because they actually did, or it might be because they are answering according to a random coin flip. Each person has a high degree of protection. Furthermore, we can recover the underlying statistics with some accuracy, as the \"true statistics\" are simply averaged with a 50% probability. Thus, if we collect a bunch of samples and it turns out that 60% of people answer yes, then we know that the TRUE distribution is actually centered around 70%, because 70% averaged with 50% (a coin flip) is 60%, which is the result we obtained. \n", - "\n", - "However, it should be noted that, especially when we only have a few samples, this comes at the cost of accuracy. This tradeoff exists across all of Differential Privacy. The greater the privacy protection (plausible deniability), the less accurate the results. \n", - "\n", - "Let's implement this local DP for our database from before!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!"
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Varying Amounts of Noise\n", - "\n", - "In this project, I want you to augment the randomized response query (the one we just wrote) to allow for varying amounts of randomness to be added. Specifically, I want you to bias the coin flip to be higher or lower and then run the same experiment. \n", - "\n", - "Note - this one is a bit trickier than you might expect. You need to both adjust the likelihood of the first coin flip AND the de-skewing at the end (where we create the \"augmented_result\" variable)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!"
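One way to parameterize the biased coin (a sketch; here `noise` is taken to be the probability that a person answers randomly, which is an assumption of this particular parameterization):

```python
import torch

def randomized_response_query(db, noise=0.2):
    # noise = probability that a person answers randomly instead of honestly
    true_result = db.float().mean()

    first_flip = (torch.rand(len(db)) > noise).float()   # 1 = honest answer
    second_flip = (torch.rand(len(db)) > 0.5).float()    # the random answer

    noisy_db = db.float() * first_flip + (1 - first_flip) * second_flip

    # observed mean = (1 - noise) * true_mean + noise * 0.5, so invert that:
    sk_result = noisy_db.mean()
    deskewed_result = (sk_result - 0.5 * noise) / (1 - noise)
    return deskewed_result, true_result

db = torch.rand(10000) > 0.5
results = {}
for noise in (0.1, 0.5, 0.9):
    deskewed, truth = randomized_response_query(db, noise)
    results[noise] = (float(deskewed), float(truth))
print(results)
```

Notice how the de-skewed estimate gets noisier as `noise` approaches 1: more privacy, less accuracy.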
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: The Formal Definition of Differential Privacy\n", - "\n", - "The previous method of adding noise was called \"Local Differential Privacy\" because we added noise to each datapoint individually. This is necessary for some situations wherein the data is SO sensitive that individuals do not trust noise to be added later. However, it comes at a very high cost in terms of accuracy. \n", - "\n", - "Alternatively, we can add noise AFTER data has been aggregated by a function. This kind of noise can allow for similar levels of protection with a lower effect on accuracy. However, participants must be able to trust that no-one looked at their datapoints _before_ the aggregation took place. 
In some situations this works out well, in others (such as an individual hand-surveying a group of people), this is less realistic.\n", - "\n", - "Nevertheless, global differential privacy is incredibly important because it allows us to perform differential privacy on smaller groups of individuals with lower amounts of noise. Let's revisit our sum functions." - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor(40.)" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "db, pdbs = create_db_and_parallels(100)\n", - "\n", - "def query(db):\n", - " return torch.sum(db.float())\n", - "\n", - "def M(db):\n", - " return query(db) + noise # conceptual sketch: 'noise' is defined in the next lesson\n", - "\n", - "query(db)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "So the idea here is that we want to add noise to the output of our function. We actually have two different kinds of noise we can add - Laplacian Noise or Gaussian Noise. However, before we do so at this point we need to dive into the formal definition of Differential Privacy.\n", - "\n", - "![alt text](dp_formula.png \"Title\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "_Image From: \"The Algorithmic Foundations of Differential Privacy\" - Cynthia Dwork and Aaron Roth - https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf_" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This definition does not _create_ differential privacy, instead it is a measure of how much privacy is afforded by a query M. Specifically, it's a comparison between running the query M on a database (x) and a parallel database (y). 
As you remember, parallel databases are defined to be the same as a full database (x) with one entry/person removed.\n", - "\n", - "Thus, this definition says that FOR ALL parallel databases, the probability of any output of the query M on database (x) is at most e^epsilon times the probability of that same output on database (y), except that this constraint is allowed to fail with probability delta. Thus, this definition is called \"epsilon delta\" differential privacy.\n", - "\n", - "# Epsilon\n", - "\n", - "Let's unpack the intuition of this for a moment. \n", - "\n", - "Epsilon Zero: If a query satisfied this inequality where epsilon was set to 0, then that would mean that the query output the exact same distribution of values for every parallel database as for the full database. As you may remember, when we calculated the \"threshold\" function, often the Sensitivity was 0. In that case, the epsilon also happened to be zero.\n", - "\n", - "Epsilon One: If a query satisfied this inequality with epsilon 1, then the output probabilities could differ by at most a factor of e^1 - or more precisely - the distance between the two random distributions M(x) and M(y) is bounded by a factor of e (because all these queries have some amount of randomness in them, just like we observed in the last section).\n", - "\n", - "# Delta\n", - "\n", - "Delta is basically the probability that the epsilon guarantee breaks. Namely, sometimes the epsilon is different for some queries than it is for others. For example, you may remember when we were calculating the sensitivity of threshold, most of the time sensitivity was 0 but sometimes it was 1. Thus, we could calculate this as \"epsilon zero but non-zero delta\" which would say that epsilon is perfect except for some probability of the time when it's arbitrarily higher. Note that this expression doesn't represent the full tradeoff between epsilon and delta."
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: How To Add Noise for Global Differential Privacy\n", - "\n", - "In this lesson, we're going to learn about how to take a query and add varying amounts of noise so that it satisfies a certain degree of differential privacy. In particular, we're going to leave behind the Local Differential Privacy previously discussed and instead opt to focus on Global Differential Privacy. \n", - "\n", - "So, to sum up, this lesson is about adding noise to the output of our query so that it satisfies a certain epsilon-delta differential privacy threshold.\n", - "\n", - "There are two kinds of noise we can add - Gaussian Noise or Laplacian Noise. Generally speaking, Laplacian is better, but both are still valid. Now to the hard question...\n", - "\n", - "### How much noise should we add?\n", - "\n", - "The amount of noise necessary to add to the output of a query is a function of four things:\n", - "\n", - "- the type of noise (Gaussian/Laplacian)\n", - "- the sensitivity of the query/function\n", - "- the desired epsilon (ε)\n", - "- the desired delta (δ)\n", - "\n", - "Thus, for each type of noise we're adding, we have a different way of calculating how much to add as a function of sensitivity, epsilon, and delta. We're going to focus on Laplacian noise. Laplacian noise is increased/decreased according to a \"scale\" parameter b. We choose \"b\" based on the following formula.\n", - "\n", - "b = sensitivity(query) / epsilon\n", - "\n", - "In other words, if we set b to be this value, then we know that we will have a privacy leakage of <= epsilon. Furthermore, the nice thing about Laplace is that it guarantees this with delta == 0. 
There are some tunings where we can have very low epsilon where delta is non-zero, but we'll ignore them for now.\n", - "\n", - "### Querying Repeatedly\n", - "\n", - "- if we query the database multiple times - we can simply add the epsilons (Even if we change the amount of noise and their epsilons are not the same)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Create a Differentially Private Query\n", - "\n", - "In this project, I want you to take what you learned in the previous lesson and create a query function which sums over the database and adds just the right amount of noise such that it satisfies an epsilon constraint. Write a query for both \"sum\" and for \"mean\". Ensure that you use the correct sensitivity measures for both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" 
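A sketch of the Laplace mechanism for both queries (the function name `laplacian_mechanism` and the use of NumPy's Laplace sampler are choices made here; the mean's sensitivity of 1/len(db) is the approximation used in this course):

```python
import numpy as np
import torch

epsilon = 0.5

def laplacian_mechanism(db, query, sensitivity):
    beta = sensitivity / epsilon                 # b = sensitivity(query) / epsilon
    noise = torch.tensor(np.random.laplace(0, beta, 1))
    return query(db) + noise

def sum_query(db):
    return db.float().sum()

def mean_query(db):
    return db.float().mean()

db = torch.rand(100) > 0.5

# sum over a binary database has sensitivity 1; the mean roughly 1/len(db)
noisy_sum = laplacian_mechanism(db, sum_query, 1)
noisy_mean = laplacian_mechanism(db, mean_query, 1 / len(db))
print(noisy_sum, noisy_mean)
```

Because the mean's sensitivity is 100x smaller, its noise scale b is 100x smaller too, so the noisy mean stays much closer to the truth at the same epsilon.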
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Differential Privacy for Deep Learning\n", - "\n", - "So in the last lessons you may have been wondering - what does all of this have to do with Deep Learning? Well, these same techniques we were just studying form the core primitives for how Differential Privacy provides guarantees in the context of Deep Learning. \n", - "\n", - "Previously, we defined perfect privacy as \"a query to a database returns the same value even if we remove any person from the database\", and used this intuition in the description of epsilon/delta. In the context of deep learning we have a similar standard.\n", - "\n", - "Training a model on a dataset should return the same model even if we remove any person from the dataset.\n", - "\n", - "Thus, we've replaced \"querying a database\" with \"training a model on a dataset\". In essence, the training process is a kind of query. However, one should note that this adds two points of complexity which database queries did not have:\n", - "\n", - " 1. 
do we always know where \"people\" are referenced in the dataset?\n", - " 2. neural models rarely train to the same output model, even on identical data\n", - "\n", - "The answer to (1) is to treat each training example as a single, separate person. Strictly speaking, this is often overly zealous as some training examples have no relevance to people and others may reference multiple or partial people (consider an image with multiple people contained within it). Thus, localizing exactly where \"people\" are referenced, and thus how much your model would change if people were removed, is challenging.\n", - "\n", - "The answer to (2) is also an open problem - but several interesting proposals have been made. We're going to focus on one of the most popular proposals, PATE.\n", - "\n", - "## An Example Scenario: A Health Neural Network\n", - "\n", - "First we're going to consider a scenario - you work for a hospital and you have a large collection of images of your patients. However, you don't know what's in them. You would like to use these images to develop a neural network which can automatically classify them; however, since your images aren't labeled, they aren't sufficient to train a classifier. \n", - "\n", - "However, being a cunning strategist, you realize that you can reach out to 10 partner hospitals which DO have annotated data. It is your hope to train your new classifier on their datasets so that you can automatically label your own. While these hospitals are interested in helping, they have privacy concerns regarding information about their patients. 
Thus, you will use the following technique to train a classifier which protects the privacy of patients in the other hospitals.\n", - "\n", - "- 1) You'll ask each of the 10 hospitals to train a model on their own datasets (All of which have the same kinds of labels)\n", - "- 2) You'll then use each of the 10 partner models to predict on your local dataset, generating 10 labels for each of your datapoints\n", - "- 3) Then, for each local data point (now with 10 labels), you will perform a DP query to generate the final true label. This query is a \"max\" function, where \"max\" is the most frequent label across the 10 labels. We will need to add laplacian noise to make this Differentially Private to a certain epsilon/delta constraint.\n", - "- 4) Finally, we will retrain a new model on our local dataset which now has labels. This will be our final \"DP\" model.\n", - "\n", - "So, let's walk through these steps. I will assume you're already familiar with how to train/predict a deep neural network, so we'll skip steps 1 and 2 and work with example data. We'll focus instead on step 3, namely how to perform the DP query for each example using toy data.\n", - "\n", - "So, let's say we have 10,000 training examples, and we've got 10 labels for each example (from our 10 \"teacher models\" which were trained directly on private data). Each label is chosen from a set of 10 possible labels (categories) for each image." 
- ] - }, - { - "cell_type": "code", - "execution_count": 49, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np" - ] - }, - { - "cell_type": "code", - "execution_count": 54, - "metadata": {}, - "outputs": [], - "source": [ - "num_teachers = 10 # we're working with 10 partner hospitals\n", - "num_examples = 10000 # the size of OUR dataset\n", - "num_labels = 10 # number of labels for our classifier" - ] - }, - { - "cell_type": "code", - "execution_count": 55, - "metadata": {}, - "outputs": [], - "source": [ - "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int).transpose(1,0) # fake predictions" - ] - }, - { - "cell_type": "code", - "execution_count": 56, - "metadata": {}, - "outputs": [], - "source": [ - "new_labels = list()\n", - "for an_image in preds:\n", - "\n", - " label_counts = np.bincount(an_image, minlength=num_labels).astype(float) # float, so the Laplace noise isn't truncated to integers\n", - "\n", - " epsilon = 0.1\n", - " beta = 1 / epsilon\n", - "\n", - " for i in range(len(label_counts)):\n", - " label_counts[i] += np.random.laplace(0, beta)\n", - "\n", - " new_label = np.argmax(label_counts)\n", - " \n", - " new_labels.append(new_label)" - ] - }, - { - "cell_type": "code", - "execution_count": 57, - "metadata": {}, - "outputs": [], - "source": [ - "# new_labels" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# PATE Analysis" - ] - }, - { - "cell_type": "code", - "execution_count": 58, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "9" - ] - }, - "execution_count": 58, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "labels = np.array([9, 9, 3, 6, 9, 9, 9, 9, 8, 2])\n", - "counts = np.bincount(labels, minlength=10)\n", - "query_result = np.argmax(counts)\n", - "query_result" - ] - }, - { - "cell_type": "code", - "execution_count": 59, - "metadata": {}, - "outputs": [], - "source": [ - "from syft.frameworks.torch.differential_privacy import pate" - ] - }, - { - "cell_type": "code", -
"execution_count": 61, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Warning: May not have used enough values of l. Increase 'moments' variable and run again.\n" - ] - } - ], - "source": [ - "num_teachers, num_examples, num_labels = (100, 100, 10)\n", - "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int) #fake preds\n", - "indices = (np.random.rand(num_examples) * num_labels).astype(int) # true answers\n", - "\n", - "preds[:,0:10] *= 0\n", - "\n", - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", - "\n", - "assert data_dep_eps < data_ind_eps\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": 64, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Warning: May not have used enough values of l. Increase 'moments' variable and run again.\n", - "Data Independent Epsilon: 11.756462732485115\n", - "Data Dependent Epsilon: 1.52655213289881\n" - ] - } - ], - "source": [ - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", - "print(\"Data Independent Epsilon:\", data_ind_eps)\n", - "print(\"Data Dependent Epsilon:\", data_dep_eps)" - ] - }, - { - "cell_type": "code", - "execution_count": 65, - "metadata": {}, - "outputs": [], - "source": [ - "preds[:,0:50] *= 0" - ] - }, - { - "cell_type": "code", - "execution_count": 66, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Data Independent Epsilon: 11.756462732485115\n", - "Data Dependent Epsilon: 0.9029013677789843\n" - ] - } - ], - "source": [ - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5, moments=20)\n", - "print(\"Data Independent Epsilon:\", data_ind_eps)\n", - "print(\"Data Dependent Epsilon:\", data_dep_eps)" - ] - }, 
- { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Where to Go From Here\n", - "\n", - "\n", - "Read:\n", - " - Algorithmic Foundations of Differential Privacy: https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf\n", - " - Deep Learning with Differential Privacy: https://arxiv.org/pdf/1607.00133.pdf\n", - " - The Ethical Algorithm: https://www.amazon.com/Ethical-Algorithm-Science-Socially-Design/dp/0190948205\n", - " \n", - "Topics:\n", - " - The Exponential Mechanism\n", - " - The Moments Accountant\n", - " - Differentially Private Stochastic Gradient Descent\n", - "\n", - "Advice:\n", - " - For deployments - stick with public frameworks!\n", - " - Join the Differential Privacy Community\n", - " - Don't get ahead of yourself - DP is still in the early days" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section Project:\n", - "\n", - "For the final project for this section, you're going to train a DP model using this PATE method on the MNIST dataset, provided below."
- ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "0it [00:00, ?it/s]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - " 99%|█████████▉| 9814016/9912422 [00:13<00:00, 1981975.99it/s]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n", - "0it [00:00, ?it/s]\u001b[A" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n", - " 0%| | 0/28881 [00:00 Date: Mon, 5 Aug 2019 06:42:01 +0530 Subject: [PATCH 02/14] Add files via upload --- Section_1_Differential_Privacy.ipynb | 1725 ++++++++++++++++++++++++++ 1 file changed, 1725 insertions(+) create mode 100644 Section_1_Differential_Privacy.ipynb diff --git a/Section_1_Differential_Privacy.ipynb b/Section_1_Differential_Privacy.ipynb new file mode 100644 index 0000000..1424d72 --- /dev/null +++ b/Section_1_Differential_Privacy.ipynb @@ -0,0 +1,1725 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "colab": { + "name": "Section 1 - Differential Privacy.ipynb", + 
"version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "5SmPdWKcNrO7", + "colab_type": "text" + }, + "source": [ + "## Lesson: Toy Differential Privacy - Simple Database Queries" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Gm1VK8ofNrO-", + "colab_type": "text" + }, + "source": [ + "In this section we're going to play around with Differential Privacy in the context of a database query. The database is going to be a VERY simple database with only one boolean column. Each row corresponds to a person. Each value corresponds to whether or not that person has a certain private attribute (such as whether they have a certain disease, or whether they are above/below a certain age). We are then going to learn how to tell whether a database query over such a small database is differentially private or not - and more importantly - what techniques are at our disposal to ensure various levels of privacy.\n", + "\n", + "\n", + "### First We Create a Simple Database\n", + "\n", + "Step one is to create our database - we're going to do this by initializing a random list of 1s and 0s (which are the entries in our database). Note - the number of entries directly corresponds to the number of people in our database."
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "WGTsLkPNNrO_", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "d48b0773-7a68-4f9e-dac2-1990acddcbdc" + }, + "source": [ + "import torch\n", + "\n", + "# the number of entries in our database\n", + "num_entries = 5000\n", + "\n", + "db = torch.rand(num_entries) > 0.5\n", + "db" + ], + "execution_count": 1, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0, 1, 1, ..., 1, 0, 1], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 1 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nhJRhUDrNrPF", + "colab_type": "text" + }, + "source": [ + "## Project: Generate Parallel Databases\n", + "\n", + "Key to the definition of differential privacy is the ability to ask the question \"When querying a database, if I removed someone from the database, would the output of the query be any different?\". Thus, in order to check this, we must construct what we term \"parallel databases\" which are simply databases with one entry removed. \n", + "\n", + "In this first project, I want you to create a list of every parallel database to the one currently contained in the \"db\" variable. 
Then, I want you to create a function which both:\n", + "\n", + "- creates the initial database (db)\n", + "- creates all parallel databases" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JJBAPNGNNrPG", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import torch" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QsZUO3xyNrPK", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "43b51009-64db-44de-b2a2-17e7df5d1ca9" + }, + "source": [ + "!pip install torch\n" + ], + "execution_count": 3, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Requirement already satisfied: torch in /usr/local/lib/python3.6/dist-packages (1.1.0)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch) (1.16.4)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "aCs8swrFNrPO", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "0e273874-54a0-4bd3-bf3e-4c20663b321d" + }, + "source": [ + "db = torch.rand(num_entries) > 0.5\n", + "db" + ], + "execution_count": 5, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 0, 1, ..., 1, 1, 1], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 5 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IxKQz_BDNrPR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "remove_index = 2" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Kz1iFm9FNrPU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def get_parallel_db(db, remove_index) :\n", + " \n", + " return torch.cat((db[0:remove_index], db[remove_index+1:]))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + 
"id": "AFsIZVt2RFqC", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "05db4e35-6d11-4aba-9515-1ee129cad0f5" + }, + "source": [ + "get_parallel_db(db, 52352) " + ], + "execution_count": 17, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 0, 1, ..., 1, 1, 1], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 17 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IOCQAp8RSZYZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def get_parallel_dbs(db):\n", + "\n", + " parallel_dbs = list()\n", + " \n", + " for i in range(len(db)):\n", + " pdb = get_parallel_db(db, i)\n", + " parallel_dbs.append(pdb)\n", + " \n", + " return parallel_dbs" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "7HKGj8DvSaWp", + "colab_type": "code", + "colab": {} + }, + "source": [ + "pdbs = get_parallel_dbs(db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "1uAFmrb2S5XO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def create_db_and_parallels(num_entries):\n", + " \n", + " db = torch.rand(num_entries) > 0.5\n", + " pdbs = get_parallel_dbs(db)\n", + " \n", + " return db, pdbs" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9IujNF8LS_XR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(20)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rYG1LFeJU6K_", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "6e015aa4-7079-4d2b-9812-89f9e86ef491" + }, + "source": [ + "db" + ], + "execution_count": 27, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + 
"tensor([1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 27 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lpzJBeKgU913", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 777 + }, + "outputId": "e05d6963-774f-4e30-ebb4-9609de0ceb73" + }, + "source": [ + "pdbs" + ], + "execution_count": 28, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 
0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8),\n", + " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", + " dtype=torch.uint8)]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 28 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WNhfE_cJNrPY", + "colab_type": "text" + }, + "source": [ + "# Lesson: Towards Evaluating The Differential Privacy of a Function\n", + "\n", + "Intuitively, we want to be able to query our database and evaluate whether or not the result of the query is leaking \"private\" information. As mentioned previously, this is about evaluating whether the output of a query changes when we remove someone from the database. Specifically, we want to evaluate the *maximum* amount the query changes when someone is removed (maximum over all possible people who could be removed). So, in order to evaluate how much privacy is leaked, we're going to iterate over each person in the database and measure the difference in the output of the query relative to when we query the entire database. \n", + "\n", + "Just for the sake of argument, let's make our first \"database query\" a simple sum. Aka, we're going to count the number of 1s in the database." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "m7wSKyu0NrPZ", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 181 + }, + "outputId": "8f5be76b-b284-480a-8635-6e95fdbbce7f" + }, + "source": [ + "db, pdbs = create_db_and_parallels(5000)" + ], + "execution_count": 4, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "PjvnskWANrPe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db):\n", + " return db.sum()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "V1RneGR-NrPo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "full_db_result = query(db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "e8ZfuZ-6NrPu", + "colab_type": "code", + "colab": {} + }, + "source": [ + "sensitivity = 0\n", + "for pdb in pdbs:\n", + " pdb_result = query(pdb)\n", + " \n", + " db_distance = torch.abs(pdb_result - full_db_result)\n", + " \n", + " if(db_distance > sensitivity):\n", + " sensitivity = db_distance" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZbOjR2BxNrP2", + "colab_type": "code", + "colab": {} + }, + 
"source": [ + "sensitivity" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nzNvdQE4NrP6", + "colab_type": "text" + }, + "source": [ + "# Project - Evaluating the Privacy of a Function\n", + "\n", + "In the last section, we measured the difference between each parallel db's query result and the query result for the entire database and then calculated the max value (which was 1). This value is called \"sensitivity\", and it corresponds to the function we chose for the query. Namely, the \"sum\" query will always have a sensitivity of exactly 1. However, we can also calculate sensitivity for other functions as well.\n", + "\n", + "Let's try to calculate sensitivity for the \"mean\" function." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bfzPBl6xNrP7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "gforqWGZNrP-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "qIKSIIoONrQE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "pDuXxPWDNrQI", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "evC2zFNGNrQO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "g53Ir0omNrQU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "HbUsdjSXNrQX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], 
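One possible solution sketch for the mean-sensitivity project above. The helper `get_parallel_db` repeats the definition from earlier cells so the snippet stands alone, and the database size of 1000 is an arbitrary choice; your own solution may look different:

```python
import torch

def get_parallel_db(db, remove_index):
    # parallel database: the original with the entry at remove_index removed
    return torch.cat((db[:remove_index], db[remove_index + 1:]))

db = (torch.rand(1000) > 0.5).float()
pdbs = [get_parallel_db(db, i) for i in range(len(db))]

def mean_query(db):
    return db.mean()

full_db_result = mean_query(db)

# sensitivity = maximum change in the query output over all parallel databases
sensitivity = max(float(torch.abs(mean_query(pdb) - full_db_result)) for pdb in pdbs)
print(sensitivity)
```

For a binary database of size n, removing one entry shifts the mean by at most roughly 1/n, so the printed sensitivity should be on the order of 0.001 here rather than the 1 we saw for the sum.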
+ "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "YV3b9NbYNrQa", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Amb03X0LNrQo", + "colab_type": "text" + }, + "source": [ + "Wow! That sensitivity is WAY lower. Note the intuition here. \"Sensitivity\" is measuring how sensitive the output of the query is to a person being removed from the database. For a simple sum, this is always 1, but for the mean, removing a person is going to change the result of the query by roughly 1 divided by the size of the database (which is much smaller). Thus, \"mean\" is a VASTLY less \"sensitive\" function (query) than SUM." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ujLjfLfZNrQt", + "colab_type": "text" + }, + "source": [ + "# Project: Calculate L1 Sensitivity For Threshold\n", + "\n", + "In this project, I want you to calculate the sensitivity for the \"threshold\" function. \n", + "\n", + "- First compute the sum over the database (i.e. sum(db)) and return whether that sum is greater than a certain threshold.\n", + "- Then, I want you to create databases of size 10 and threshold of 5 and calculate the sensitivity of the function. \n", + "- Finally, re-initialize the database 10 times and calculate the sensitivity each time." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "NdoBSl7UNrQv", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!"
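One possible solution sketch for the threshold project. The helpers repeat the `get_parallel_db`/`create_db_and_parallels` definitions from earlier cells so the snippet stands alone; `threshold=5` follows the project description:

```python
import torch

def get_parallel_db(db, remove_index):
    # parallel database: the original with one entry removed
    return torch.cat((db[:remove_index], db[remove_index + 1:]))

def create_db_and_parallels(num_entries):
    db = torch.rand(num_entries) > 0.5
    pdbs = [get_parallel_db(db, i) for i in range(num_entries)]
    return db, pdbs

def query(db, threshold=5):
    # 1.0 if the sum of the database exceeds the threshold, else 0.0
    return (db.sum() > threshold).float()

# re-initialize the database 10 times and measure the sensitivity each time
for _ in range(10):
    db, pdbs = create_db_and_parallels(10)
    full_db_result = query(db)
    sensitivity = 0
    for pdb in pdbs:
        db_distance = torch.abs(query(pdb) - full_db_result)
        if db_distance > sensitivity:
            sensitivity = db_distance
    print(sensitivity)
```

Because the threshold query only ever outputs 0 or 1, each measured sensitivity is either 0 (the sum is far from the threshold) or 1 (removing one person flips the answer).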
+ ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rkL2cW8INrQ_", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "hwFy3MJXNrRF", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "iXOLrclGNrRJ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EYDeItdmNrRP", + "colab_type": "text" + }, + "source": [ + "# Lesson: A Basic Differencing Attack\n", + "\n", + "Sadly, none of the functions we've looked at so far are differentially private (despite them having varying levels of sensitivity). The most basic type of attack can be done as follows.\n", + "\n", + "Let's say we wanted to figure out a specific person's value in the database. All we would have to do is query for the sum of the entire database and then the sum of the entire database without that person!\n", + "\n", + "# Project: Perform a Differencing Attack on Row 10\n", + "\n", + "In this project, I want you to construct a database and then demonstrate how you can use two different sum queries to expose the value of the person represented by row 10 in the database (note: you'll need to use a database with at least 10 rows)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "b7iBppCDNrRQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!"
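One possible solution sketch for the differencing attack. The database size of 100 is an arbitrary choice; the attack works on any database with more than 10 rows:

```python
import torch

db = (torch.rand(100) > 0.5).float()

# sum query over the entire database
sum_full = db.sum()

# sum query over the parallel database with row 10 removed
pdb = torch.cat((db[:10], db[11:]))
sum_without = pdb.sum()

# the difference between the two sums is exactly that person's private value
exposed_value = sum_full - sum_without
print(exposed_value, db[10])
```

The same trick works with a mean or threshold query; any deterministic query lets the attacker recover the missing row by comparing the full and parallel results.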
+ ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "-zt1hTwvNrRU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "_8J6Bt3ONrRX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VM8ZkQ62NrRb", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "R11dyl7gNrRe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "bw_5CeM6NrRi", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UpgIeau6NrRm", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VEnn4sr8NrRr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fg6FPpcnNrRx", + "colab_type": "text" + }, + "source": [ + "# Project: Local Differential Privacy\n", + "\n", + "As you can see, the basic sum query is not differentially private at all! In truth, differential privacy always requires a form of randomness added to the query. Let me show you what I mean.\n", + "\n", + "### Randomized Response (Local Differential Privacy)\n", + "\n", + "Let's say I have a group of people I wish to survey about a very taboo behavior which I think they will lie about (say, I want to know if they have ever committed a certain kind of crime). 
I'm not a policeman, I'm just trying to collect statistics to understand the higher-level trend in society. So, how do we do this? One technique is to add randomness to each person's response by giving each person the following instructions (assuming I'm asking a simple yes/no question):\n", + "\n", + "- Flip a coin 2 times.\n", + "- If the first coin flip is heads, answer honestly\n", + "- If the first coin flip is tails, answer according to the second coin flip (heads for yes, tails for no)!\n", + "\n", + "Thus, each person is now protected with \"plausible deniability\". If they answer \"Yes\" to the question \"have you committed X crime?\", then it might be because they actually did, or it might be because they are answering according to a random coin flip. Each person has a high degree of protection. Furthermore, we can recover the underlying statistics with some accuracy, as the \"true statistics\" are simply averaged with a 50% probability. Thus, if we collect a bunch of samples and it turns out that 60% of people answer yes, then we know that the TRUE distribution is actually centered around 70%, because 70% averaged with 50% (a coin flip) is 60%, which is the result we obtained. \n", + "\n", + "However, it should be noted that, especially when we only have a few samples, this comes at the cost of accuracy. This tradeoff exists across all of Differential Privacy. The greater the privacy protection (plausible deniability), the less accurate the results. \n", + "\n", + "Let's implement this local DP for our database from before!" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "k5IV99F8NrRy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!"
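One possible solution sketch for the randomized response project, directly implementing the two-coin-flip protocol described above (the function name `randomized_response_query` is illustrative):

```python
import torch

def randomized_response_query(db):
    n = len(db)
    # first coin flip: heads (1.0) -> answer honestly
    first_flip = (torch.rand(n) > 0.5).float()
    # second coin flip: heads -> answer "yes", tails -> answer "no"
    second_flip = (torch.rand(n) > 0.5).float()
    # each answer is the true value, or (with 50% probability) a random coin flip
    noisy_db = db.float() * first_flip + (1 - first_flip) * second_flip
    # de-skew: the observed mean is (true mean + 0.5) / 2
    deskewed_result = noisy_db.mean() * 2 - 0.5
    return deskewed_result, db.float().mean()

db = torch.rand(10000) > 0.5
noisy_result, true_result = randomized_response_query(db)
print(noisy_result, true_result)
```

With a large database the de-skewed result lands close to the true mean; with only a handful of entries the noise dominates, which is the accuracy/privacy tradeoff described above.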
+ ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "x_DCCq81NrR5", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3kTgbGEnNrR9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "h3gQASKuNrSB", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UGo0wEFKNrSD", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "MyYXDeDhNrSH", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "n9EXjcX-NrSL", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "xUCLwJObNrSO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "kNwpUP7HNrSQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cOm1aBq8NrST", + "colab_type": "text" + }, + "source": [ + "# Project: Varying Amounts of Noise\n", + "\n", + "In this project, I want you to augment the randomized response query (the one we just wrote) to allow for varying amounts of randomness to be added. Specifically, I want you to bias the first coin flip to come up heads with higher or lower probability and then run the same experiment. \n", + "\n", + "Note - this one is a bit trickier than you might expect. 
You need to both adjust the likelihood of the first coin flip AND the de-skewing at the end (where we create the \"augmented_result\" variable)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AUKLRajhNrSV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "DPTScgmQNrSX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "7cLnh8ryNrSZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "bOTnIAQSNrSe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "CmxohDVBNrSn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "KpI8-sx1NrSr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VMGvYTRbNrSu", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "GiZr7oIiNrS5", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "cS7J1evZNrS9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nqOIsCIHNrS_", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", 
+ "metadata": { + "id": "FRqBEo82NrTE", + "colab_type": "text" + }, + "source": [ + "# Lesson: The Formal Definition of Differential Privacy\n", + "\n", + "The previous method of adding noise was called \"Local Differential Privacy\" because we added noise to each datapoint individually. This is necessary for some situations wherein the data is SO sensitive that individuals do not trust anyone else to add the noise later. However, it comes at a very high cost in terms of accuracy. \n", + "\n", + "Alternatively, we can add noise AFTER the data has been aggregated by a function. This kind of noise can allow for similar levels of protection with a lower effect on accuracy. However, participants must be able to trust that no-one looked at their datapoints _before_ the aggregation took place. In some situations this works out well, in others (such as an individual hand-surveying a group of people), this is less realistic.\n", + "\n", + "Nevertheless, global differential privacy is incredibly important because it allows us to perform differential privacy on smaller groups of individuals with lower amounts of noise. Let's revisit our sum functions." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "nQwYWNZcNrTE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "\n", + "def query(db):\n", + " return torch.sum(db.float())\n", + "\n", + "def M(db):\n", + " return query(db) + noise # conceptual sketch: noise is defined in the next lesson\n", + "\n", + "query(db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6Qvj7T79NrTH", + "colab_type": "text" + }, + "source": [ + "So the idea here is that we want to add noise to the output of our function. We actually have two different kinds of noise we can add - Laplacian Noise or Gaussian Noise. 
However, before we do so, we need to dive into the formal definition of Differential Privacy.\n", + "\n", + "![alt text](dp_formula.png \"Title\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9Gh9nZDhNrTI", + "colab_type": "text" + }, + "source": [ + "_Image From: \"The Algorithmic Foundations of Differential Privacy\" - Cynthia Dwork and Aaron Roth - https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf_" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D73LS5IdNrTJ", + "colab_type": "text" + }, + "source": [ + "This definition does not _create_ differential privacy; instead, it is a measure of how much privacy is afforded by a query M. Specifically, it's a comparison between running the query M on a database (x) and a parallel database (y). As you remember, parallel databases are defined to be the same as a full database (x) with one entry/person removed.\n", + "\n", + "Thus, this definition says that FOR ALL parallel databases, the probability of the query M producing any given output on database (x) is at most e^epsilon times the probability of that same output on database (y), but that occasionally this constraint won't hold, with probability delta. Thus, this definition is called \"epsilon-delta\" differential privacy.\n", + "\n", + "# Epsilon\n", + "\n", + "Let's unpack the intuition of this for a moment. \n", + "\n", + "Epsilon Zero: If a query satisfied this inequality where epsilon was set to 0, then that would mean that the query for all parallel databases outputted the exact same value as the full database. As you may remember, when we calculated the \"threshold\" function, often the Sensitivity was 0. 
In that case, the epsilon also happened to be zero.\n", + "\n", + "Epsilon One: If a query satisfied this inequality with epsilon 1, then the maximum distance between all queries would be 1 - or more precisely - the maximum distance between the two random distributions M(x) and M(y) is 1 (because all these queries have some amount of randomness in them, just like we observed in the last section).\n", + "\n", + "# Delta\n", + "\n", + "Delta is basically the probability that epsilon breaks. Namely, sometimes the epsilon is different for some queries than it is for others. For example, you may remember when we were calculating the sensitivity of threshold, most of the time sensitivity was 0 but sometimes it was 1. Thus, we could calculate this as \"epsilon zero but non-zero delta\" which would say that epsilon is perfect except for some probability of the time when it's arbitrarily higher. Note that this expression doesn't represent the full tradeoff between epsilon and delta." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1CH6FPP0NrTK", + "colab_type": "text" + }, + "source": [ + "# Lesson: How To Add Noise for Global Differential Privacy\n", + "\n", + "In this lesson, we're going to learn about how to take a query and add varying amounts of noise so that it satisfies a certain degree of differential privacy. In particular, we're going to leave behind the Local Differential privacy previously discussed and instead opt to focus on Global differential privacy. \n", + "\n", + "So, to sum up, this lesson is about adding noise to the output of our query so that it satisfies a certain epsilon-delta differential privacy threshold.\n", + "\n", + "There are two kinds of noise we can add - Gaussian Noise or Laplacian Noise. Generally speaking Laplacian is better, but both are still valid. 
Now to the hard question...\n", + "\n", + "### How much noise should we add?\n", + "\n", + "The amount of noise necessary to add to the output of a query is a function of four things:\n", + "\n", + "- the type of noise (Gaussian/Laplacian)\n", + "- the sensitivity of the query/function\n", + "- the desired epsilon (ε)\n", + "- the desired delta (δ)\n", + "\n", + "Thus, for each type of noise we're adding, we have a different way of calculating how much to add as a function of sensitivity, epsilon, and delta. We're going to focus on Laplacian noise. Laplacian noise is increased/decreased according to a \"scale\" parameter b. We choose \"b\" based on the following formula.\n", + "\n", + "b = sensitivity(query) / epsilon\n", + "\n", + "In other words, if we set b to be this value, then we know that we will have a privacy leakage of <= epsilon. Furthermore, the nice thing about Laplace is that it guarantees this with delta == 0. There are some tunings where we can have very low epsilon with non-zero delta, but we'll ignore them for now.\n", + "\n", + "### Querying Repeatedly\n", + "\n", + "- If we query the database multiple times, the epsilons simply add up (even if each query uses a different amount of noise and thus a different epsilon)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "kWXSly0zNrTK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "n1J2kT3SNrTN", + "colab_type": "text" + }, + "source": [ + "# Project: Create a Differentially Private Query\n", + "\n", + "In this project, I want you to take what you learned in the previous lesson and create a query function which sums over the database and adds just the right amount of noise such that it satisfies an epsilon constraint. Write a query for both \"sum\" and for \"mean\". Ensure that you use the correct sensitivity measures for both." 
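To make the b = sensitivity(query) / epsilon recipe concrete, here is a minimal sketch of such a noisy query (the helper name `laplacian_mechanism` is my own, not part of the course code), using a sensitivity of 1 for sum and 1/len(db) for mean over a boolean database:

```python
import torch

# Sketch of the Laplace mechanism: noise scale b = sensitivity / epsilon.
# For a 0/1 database, "sum" has sensitivity 1 and "mean" roughly 1/len(db).
def laplacian_mechanism(db, query, sensitivity, epsilon):
    b = sensitivity / epsilon  # scale of the Laplace noise
    noise = torch.distributions.Laplace(0.0, b).sample()
    return query(db.float()) + noise

db = (torch.rand(5000) > 0.5)

noisy_sum = laplacian_mechanism(db, torch.sum, sensitivity=1, epsilon=0.5)
noisy_mean = laplacian_mechanism(db, torch.mean, sensitivity=1 / len(db), epsilon=0.5)
print(noisy_sum, noisy_mean)
```

Because the mean's sensitivity is roughly 1/len(db), the noisy mean is vastly more accurate than the noisy sum at the same epsilon.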
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AjtzGiLBNrTN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# try this project here!" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "lznEr0BONrTQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Goc0emOKNrTT", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3gWoxTewNrTd", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "o1kv4rPINrTj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Lko_2XOUNrTn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Ij6Zh3WUNrTr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "b3p0xznUNrTy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "S-yiuwr3NrT5", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bv2Z7i8bNrT-", + "colab_type": "text" + }, + "source": [ + "# Lesson: Differential Privacy for Deep Learning\n", + "\n", + "So in the last lessons you may have been wondering - what does all of this have to do with Deep Learning? 
Well, these same techniques we were just studying form the core primitives for how Differential Privacy provides guarantees in the context of Deep Learning. \n", + "\n", + "Previously, we defined perfect privacy as \"a query to a database returns the same value even if we remove any person from the database\", and used this intuition in the description of epsilon/delta. In the context of deep learning we have a similar standard.\n", + "\n", + "Training a model on a dataset should return the same model even if we remove any person from the dataset.\n", + "\n", + "Thus, we've replaced \"querying a database\" with \"training a model on a dataset\". In essence, the training process is a kind of query. However, one should note that this adds two points of complexity which database queries did not have:\n", + "\n", + " 1. do we always know where \"people\" are referenced in the dataset?\n", + " 2. neural models almost never train to the same output model twice, even on identical data\n", + "\n", + "The answer to (1) is to treat each training example as a single, separate person. Strictly speaking, this is often overly zealous, as some training examples have no relevance to people and others may have multiple/partial references (consider an image with multiple people contained within it). Thus, localizing exactly where \"people\" are referenced, and thus how much your model would change if people were removed, is challenging.\n", + "\n", + "The answer to (2) is also an open problem - but several interesting proposals have been made. We're going to focus on one of the most popular proposals, PATE.\n", + "\n", + "## An Example Scenario: A Health Neural Network\n", + "\n", + "First we're going to consider a scenario - you work for a hospital and you have a large collection of images of your patients. However, you don't know what's in them. 
You would like to use these images to develop a neural network which can automatically classify them; however, since your images aren't labeled, they aren't sufficient to train a classifier. \n", + "\n", + "However, being a cunning strategist, you realize that you can reach out to 10 partner hospitals which DO have annotated data. It is your hope to train your new classifier on their datasets so that you can automatically label your own. While these hospitals are interested in helping, they have privacy concerns regarding information about their patients. Thus, you will use the following technique to train a classifier which protects the privacy of patients in the other hospitals.\n", + "\n", + "- 1) You'll ask each of the 10 hospitals to train a model on their own datasets (all of which have the same kinds of labels)\n", + "- 2) You'll then use each of the 10 partner models to predict on your local dataset, generating 10 labels for each of your datapoints\n", + "- 3) Then, for each local data point (now with 10 labels), you will perform a DP query to generate the final true label. This query is a \"max\" function, where \"max\" returns the most frequent label across the 10 labels. We will need to add Laplacian noise to make this Differentially Private to a certain epsilon/delta constraint.\n", + "- 4) Finally, we will retrain a new model on our local dataset which now has labels. This will be our final \"DP\" model.\n", + "\n", + "So, let's walk through these steps. I will assume you're already familiar with how to train/predict a deep neural network, so we'll skip steps 1 and 2 and work with example data. We'll focus instead on step 3, namely how to perform the DP query for each example using toy data.\n", + "\n", + "So, let's say we have 10,000 training examples, and we've got 10 labels for each example (from our 10 \"teacher models\" which were trained directly on private data). Each label is chosen from a set of 10 possible labels (categories) for each image." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "30skLUk2NrUA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import numpy as np" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "q3tV7jL4NrUE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "num_teachers = 10 # we're working with 10 partner hospitals\n", + "num_examples = 10000 # the size of OUR dataset\n", + "num_labels = 10 # number of labels for our classifier" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "IaT_f1zKNrUJ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int).transpose(1,0) # fake predictions" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nt38zWhYNrUW", + "colab_type": "code", + "colab": {} + }, + "source": [ + "new_labels = list()\n", + "for an_image in preds:\n", + "\n", + " label_counts = np.bincount(an_image, minlength=num_labels)\n", + "\n", + " epsilon = 0.1\n", + " beta = 1 / epsilon\n", + "\n", + " # bincount returns integers - cast to float so the Laplace noise isn't truncated\n", + " label_counts = label_counts.astype(float) + np.random.laplace(0, beta, num_labels)\n", + "\n", + " new_label = np.argmax(label_counts)\n", + " \n", + " new_labels.append(new_label)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "xc41XXHDNrUo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# new_labels" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "u52N4rs-NrU0", + "colab_type": "text" + }, + "source": [ + "# PATE Analysis" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "6-tQ3PAANrU9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "labels = np.array([9, 9, 3, 6, 9, 9, 9, 9, 8, 2])\n", + "counts = np.bincount(labels, minlength=10)\n", + "query_result = 
np.argmax(counts)\n", + "query_result" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "G_Ui-1O3NrVA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from syft.frameworks.torch.differential_privacy import pate" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "enf7GqbBNrVF", + "colab_type": "code", + "colab": {} + }, + "source": [ + "num_teachers, num_examples, num_labels = (100, 100, 10)\n", + "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int) #fake preds\n", + "indices = (np.random.rand(num_examples) * num_labels).astype(int) # true answers\n", + "\n", + "preds[:,0:10] *= 0\n", + "\n", + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", + "\n", + "assert data_dep_eps < data_ind_eps\n", + "\n" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rtcGSVu_NrVM", + "colab_type": "code", + "colab": {} + }, + "source": [ + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", + "print(\"Data Independent Epsilon:\", data_ind_eps)\n", + "print(\"Data Dependent Epsilon:\", data_dep_eps)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "54GPud2ENrVU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "preds[:,0:50] *= 0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "fvxVNNEMNrVY", + "colab_type": "code", + "colab": {} + }, + "source": [ + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5, moments=20)\n", + "print(\"Data Independent Epsilon:\", data_ind_eps)\n", + "print(\"Data Dependent Epsilon:\", data_dep_eps)" + ], + "execution_count": 0, + "outputs": [] + }, + { 
+ "cell_type": "code", + "metadata": { + "id": "RY-gAWL9NrVd", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BmlFIVPaNrVk", + "colab_type": "text" + }, + "source": [ + "# Where to Go From Here\n", + "\n", + "\n", + "Read:\n", + " - The Algorithmic Foundations of Differential Privacy: https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf\n", + " - Deep Learning with Differential Privacy: https://arxiv.org/pdf/1607.00133.pdf\n", + " - The Ethical Algorithm: https://www.amazon.com/Ethical-Algorithm-Science-Socially-Design/dp/0190948205\n", + " \n", + "Topics:\n", + " - The Exponential Mechanism\n", + " - The Moments Accountant\n", + " - Differentially Private Stochastic Gradient Descent\n", + "\n", + "Advice:\n", + " - For deployments - stick with public frameworks!\n", + " - Join the Differential Privacy Community\n", + " - Don't get ahead of yourself - DP is still in the early days" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "wa9mH1R7NrVn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QhSsS_zsNrVz", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q8GdrZB1NrV2", + "colab_type": "text" + }, + "source": [ + "# Section Project:\n", + "\n", + "For the final project for this section, you're going to train a DP model using this PATE method on the MNIST dataset, provided below." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "BsblHz0vNrV3", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import torchvision.datasets as datasets\n", + "mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=None)\n", + "mnist_testset = datasets.MNIST(root='./data', train=False, download=True, transform=None)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ocrWt2AtNrWC", + "colab_type": "code", + "colab": {} + }, + "source": [ + "train_data = mnist_trainset.data\n", + "train_targets = mnist_trainset.targets" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "31jgZwaNNrWL", + "colab_type": "code", + "colab": {} + }, + "source": [ + "test_data = mnist_testset.data\n", + "test_targets = mnist_testset.targets" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "6gC9xD2hNrWU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file From bd3a696025fc1290de485c3f80c12391f90ee9d4 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Wed, 21 Aug 2019 17:33:19 +0530 Subject: [PATCH 03/14] Delete Section_1_Differential_Privacy.ipynb --- Section_1_Differential_Privacy.ipynb | 1725 -------------------------- 1 file changed, 1725 deletions(-) delete mode 100644 Section_1_Differential_Privacy.ipynb diff --git a/Section_1_Differential_Privacy.ipynb b/Section_1_Differential_Privacy.ipynb deleted file mode 100644 index 1424d72..0000000 --- a/Section_1_Differential_Privacy.ipynb +++ /dev/null @@ -1,1725 +0,0 @@ -{ - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": 
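As a starting point for the section project, one possible way to simulate the "teacher hospitals" (this partitioning scheme is an assumption of mine, not a prescribed recipe) is to split the training set into disjoint subsets, one per teacher:

```python
import numpy as np

# Sketch: simulate 10 "teacher hospitals" by splitting the 60,000 MNIST
# training examples into 10 disjoint partitions, one per teacher model.
num_teachers = 10
num_train = 60000

shuffled = np.random.permutation(num_train)
teacher_splits = np.array_split(shuffled, num_teachers)

print([len(split) for split in teacher_splits])
```

Each teacher model would then be trained only on its own partition, and its predictions on your unlabeled data fed into the noisy-max query shown earlier.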
"python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.3" - }, - "colab": { - "name": "Section 1 - Differential Privacy.ipynb", - "version": "0.3.2", - "provenance": [] - } - }, - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "5SmPdWKcNrO7", - "colab_type": "text" - }, - "source": [ - "## Lesson: Toy Differential Privacy - Simple Database Queries" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Gm1VK8ofNrO-", - "colab_type": "text" - }, - "source": [ - "In this section we're going to play around with Differential Privacy in the context of a database query. The database is going to be a VERY simple database with only one boolean column. Each row corresponds to a person. Each value corresponds to whether or not that person has a certain private attribute (such as whether they have a certain disease, or whether they are above/below a certain age). We are then going to learn how to know whether a database query over such a small database is differentially private or not - and more importantly - what techniques are at our disposal to ensure various levels of privacy\n", - "\n", - "\n", - "### First We Create a Simple Database\n", - "\n", - "Step one is to create our database - we're going to do this by initializing a random list of 1s and 0s (which are the entries in our database). Note - the number of entries directly corresponds to the number of people in our database." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "WGTsLkPNNrO_", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 36 - }, - "outputId": "d48b0773-7a68-4f9e-dac2-1990acddcbdc" - }, - "source": [ - "import torch\n", - "\n", - "# the number of entries in our database\n", - "num_entries = 5000\n", - "\n", - "db = torch.rand(num_entries) > 0.5\n", - "db" - ], - "execution_count": 1, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "tensor([0, 1, 1, ..., 1, 0, 1], dtype=torch.uint8)" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 1 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nhJRhUDrNrPF", - "colab_type": "text" - }, - "source": [ - "## Project: Generate Parallel Databases\n", - "\n", - "Key to the definition of differential privacy is the ability to ask the question \"When querying a database, if I removed someone from the database, would the output of the query be any different?\". Thus, in order to check this, we must construct what we term \"parallel databases\" which are simply databases with one entry removed. \n", - "\n", - "In this first project, I want you to create a list of every parallel database to the one currently contained in the \"db\" variable. 
Then, I want you to create a function which both:\n", - "\n", - "- creates the initial database (db)\n", - "- creates all parallel databases" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "JJBAPNGNNrPG", - "colab_type": "code", - "colab": {} - }, - "source": [ - "import torch" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "QsZUO3xyNrPK", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 55 - }, - "outputId": "43b51009-64db-44de-b2a2-17e7df5d1ca9" - }, - "source": [ - "!pip install torch\n" - ], - "execution_count": 3, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Requirement already satisfied: torch in /usr/local/lib/python3.6/dist-packages (1.1.0)\n", - "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch) (1.16.4)\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "aCs8swrFNrPO", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 36 - }, - "outputId": "0e273874-54a0-4bd3-bf3e-4c20663b321d" - }, - "source": [ - "num_entries = 5000\n", - "\n", - "db = torch.rand(num_entries) > 0.5\n", - "db" - ], - "execution_count": 5, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "tensor([1, 0, 1, ..., 1, 1, 1], dtype=torch.uint8)" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 5 - } - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "IxKQz_BDNrPR", - "colab_type": "code", - "colab": {} - }, - "source": [ - "remove_index = 2" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Kz1iFm9FNrPU", - "colab_type": "code", - "colab": {} - }, - "source": [ - "def get_parallel_db(db, remove_index) :\n", - " \n", - " return torch.cat((db[0:remove_index], db[remove_index+1:]))" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - 
"id": "AFsIZVt2RFqC", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 36 - }, - "outputId": "05db4e35-6d11-4aba-9515-1ee129cad0f5" - }, - "source": [ - "get_parallel_db(db, 52352) " - ], - "execution_count": 17, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "tensor([1, 0, 1, ..., 1, 1, 1], dtype=torch.uint8)" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 17 - } - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "IOCQAp8RSZYZ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "def get_parallel_dbs(db) :\n", - "\n", - " parallel_dbs = list()\n", - " \n", - " for i in range(len(db)) :\n", - " pdb = get_parallel_db(db, i)\n", - " parallel_dbs.append(pdb)\n", - " \n", - " return parallel_dbs" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "7HKGj8DvSaWp", - "colab_type": "code", - "colab": {} - }, - "source": [ - "pdbs = get_parallel_dbs(db)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "1uAFmrb2S5XO", - "colab_type": "code", - "colab": {} - }, - "source": [ - "def create_db_and_parallels(num_entries) :\n", - " \n", - " db = torch.rand(num_entries) > 0.5\n", - " pdbs = get_parallel_dbs(db)\n", - " \n", - " return db, pdbs" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "9IujNF8LS_XR", - "colab_type": "code", - "colab": {} - }, - "source": [ - "db, pdbs = create_db_and_parallels(20)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "rYG1LFeJU6K_", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 55 - }, - "outputId": "6e015aa4-7079-4d2b-9812-89f9e86ef491" - }, - "source": [ - "db" - ], - "execution_count": 27, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - 
"tensor([1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8)" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 27 - } - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "lpzJBeKgU913", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 777 - }, - "outputId": "e05d6963-774f-4e30-ebb4-9609de0ceb73" - }, - "source": [ - "pdbs" - ], - "execution_count": 28, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "[tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 
0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8),\n", - " tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0],\n", - " dtype=torch.uint8)]" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 28 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WNhfE_cJNrPY", - "colab_type": "text" - }, - "source": [ - "# Lesson: Towards Evaluating The Differential Privacy of a Function\n", - "\n", - "Intuitively, we want to be able to query our database and evaluate whether or not the result of the query is leaking \"private\" information. As mentioned previously, this is about evaluating whether the output of a query changes when we remove someone from the database. Specifically, we want to evaluate the *maximum* amount the query changes when someone is removed (maximum over all possible people who could be removed). So, in order to evaluate how much privacy is leaked, we're going to iterate over each person in the database and measure the difference in the output of the query relative to when we query the entire database. \n", - "\n", - "Just for the sake of argument, let's make our first \"database query\" a simple sum. Aka, we're going to count the number of 1s in the database." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "m7wSKyu0NrPZ", - "colab_type": "code", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 181 - }, - "outputId": "8f5be76b-b284-480a-8635-6e95fdbbce7f" - }, - "source": [ - "db, pdbs = create_db_and_parallels(5000)" - ], - "execution_count": 4, - "outputs": [ - { - "output_type": "error", - "ename": "NameError", - "evalue": "ignored", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpdbs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcreate_db_and_parallels\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5000\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;31mNameError\u001b[0m: name 'create_db_and_parallels' is not defined" - ] - } - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "PjvnskWANrPe", - "colab_type": "code", - "colab": {} - }, - "source": [ - "def query(db):\n", - " return db.sum()" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "V1RneGR-NrPo", - "colab_type": "code", - "colab": {} - }, - "source": [ - "full_db_result = query(db)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "e8ZfuZ-6NrPu", - "colab_type": "code", - "colab": {} - }, - "source": [ - "sensitivity = 0\n", - "for pdb in pdbs:\n", - " pdb_result = query(pdb)\n", - " \n", - " db_distance = torch.abs(pdb_result - full_db_result)\n", - " \n", - " if(db_distance > sensitivity):\n", - " sensitivity = db_distance" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "ZbOjR2BxNrP2", - "colab_type": "code", - "colab": {} - }, - 
"source": [ - "sensitivity" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nzNvdQE4NrP6", - "colab_type": "text" - }, - "source": [ - "# Project - Evaluating the Privacy of a Function\n", - "\n", - "In the last section, we measured the difference between each parallel db's query result and the query result for the entire database and then calculated the max value (which was 1). This value is called \"sensitivity\", and it corresponds to the function we chose for the query. Namely, the \"sum\" query will always have a sensitivity of exactly 1. However, we can also calculate sensitivity for other functions as well.\n", - "\n", - "Let's try to calculate sensitivity for the \"mean\" function." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "bfzPBl6xNrP7", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "gforqWGZNrP-", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "qIKSIIoONrQE", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "pDuXxPWDNrQI", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "evC2zFNGNrQO", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "g53Ir0omNrQU", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "HbUsdjSXNrQX", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], 
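One possible solution sketch for this project (a self-contained re-implementation of the notebook's helpers, so the function names here are illustrative rather than required):

```python
import torch

# helper mirroring the lesson: a parallel database with entry i removed
def get_parallel_db(db, remove_index):
    return torch.cat((db[0:remove_index], db[remove_index + 1:]))

# empirical sensitivity: max change in the query over all parallel databases
def sensitivity(query, num_entries=1000):
    db = (torch.rand(num_entries) > 0.5).float()
    full_result = query(db)
    max_distance = 0.0
    for i in range(num_entries):
        pdb = get_parallel_db(db, i)
        distance = torch.abs(query(pdb) - full_result).item()
        max_distance = max(max_distance, distance)
    return max_distance

# for "mean", the result should be roughly 1 / num_entries
print(sensitivity(torch.mean, 1000))
```

The printed value is on the order of 0.0005 for 1,000 entries, confirming the intuition that mean is far less sensitive than sum.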
- "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "YV3b9NbYNrQa", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Amb03X0LNrQo", - "colab_type": "text" - }, - "source": [ - "Wow! That sensitivity is WAY lower. Note the intuition here. \"Sensitivity\" is measuring how sensitive the output of the query is to a person being removed from the database. For a simple sum, this is always 1, but for the mean, removing a person is going to change the result of the query by roughly 1 divided by the size of the database (which is much smaller). Thus, \"mean\" is a VASTLY less \"sensitive\" function (query) than SUM." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ujLjfLfZNrQt", - "colab_type": "text" - }, - "source": [ - "# Project: Calculate L1 Sensitivity For Threshold\n", - "\n", - "In this project, I want you to calculate the sensitivity for the \"threshold\" function. \n", - "\n", - "- First compute the sum over the database (i.e. sum(db)) and return whether that sum is greater than a certain threshold.\n", - "- Then, I want you to create databases of size 10 and a threshold of 5 and calculate the sensitivity of the function. \n", - "- Finally, re-initialize the database 10 times and calculate the sensitivity each time." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "NdoBSl7UNrQv", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!" 
- ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "rkL2cW8INrQ_", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "hwFy3MJXNrRF", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "iXOLrclGNrRJ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "EYDeItdmNrRP", - "colab_type": "text" - }, - "source": [ - "# Lesson: A Basic Differencing Attack\n", - "\n", - "Sadly, none of the functions we've looked at so far are differentially private (despite them having varying levels of sensitivity). The most basic type of attack can be done as follows.\n", - "\n", - "Let's say we wanted to figure out a specific person's value in the database. All we would have to do is query for the sum of the entire database and then the sum of the entire database without that person!\n", - "\n", - "# Project: Perform a Differencing Attack on Row 10\n", - "\n", - "In this project, I want you to construct a database and then demonstrate how you can use two different sum queries to expose the value of the person represented by row 10 in the database (note, you'll need to use a database with at least 10 rows)" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "b7iBppCDNrRQ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!" 
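The differencing attack just described can be sketched on a toy database as follows (a minimal illustration, taking "row 10" to mean index 10):

```python
import torch

# Toy differencing attack: two sum queries reveal row 10's private value.
db = (torch.rand(100) > 0.5)

sum_with_person = db.sum()
sum_without_person = torch.cat((db[0:10], db[11:])).sum()  # row 10 removed

# the difference between the two sums is exactly row 10's value
exposed_value = sum_with_person - sum_without_person
print(exposed_value, db[10])
```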
- ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "-zt1hTwvNrRU", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "_8J6Bt3ONrRX", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "VM8ZkQ62NrRb", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "R11dyl7gNrRe", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "bw_5CeM6NrRi", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "UpgIeau6NrRm", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "VEnn4sr8NrRr", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "fg6FPpcnNrRx", - "colab_type": "text" - }, - "source": [ - "# Project: Local Differential Privacy\n", - "\n", - "As you can see, the basic sum query is not differentially private at all! In truth, differential privacy always requires a form of randomness added to the query. Let me show you what I mean.\n", - "\n", - "### Randomized Response (Local Differential Privacy)\n", - "\n", - "Let's say I have a group of people I wish to survey about a very taboo behavior which I think they will lie about (say, I want to know if they have ever committed a certain kind of crime). 
I'm not a policeman, I'm just trying to collect statistics to understand the higher level trend in society. So, how do we do this? One technique is to add randomness to each person's response by giving each person the following instructions (assuming I'm asking a simple yes/no question):\n", - "\n", - "- Flip a coin 2 times.\n", - "- If the first coin flip is heads, answer honestly.\n", - "- If the first coin flip is tails, answer according to the second coin flip (heads for yes, tails for no)!\n", - "\n", - "Thus, each person is now protected with \"plausible deniability\". If they answer \"Yes\" to the question \"have you committed X crime?\", then it might be because they actually did, or it might be because they are answering according to a random coin flip. Each person has a high degree of protection. Furthermore, we can recover the underlying statistics with some accuracy, as the \"true statistics\" are simply averaged with a 50% probability. Thus, if we collect a bunch of samples and it turns out that 60% of people answer yes, then we know that the TRUE distribution is actually centered around 70%, because 70% averaged with 50% (a coin flip) is 60%, which is the result we obtained. \n", - "\n", - "However, it should be noted that, especially when we only have a few samples, this comes at the cost of accuracy. This tradeoff exists across all of Differential Privacy. The greater the privacy protection (plausible deniability), the less accurate the results. \n", - "\n", - "Let's implement this local DP for our database from before!" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "k5IV99F8NrRy", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!"
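A possible sketch of the randomized response project in plain Python. The variable name `augmented_result` mirrors the name the next project refers to; the rest of the naming is mine:

```python
import random

def randomized_response(truth):
    # first flip: heads -> answer honestly
    if random.random() < 0.5:
        return truth
    # tails -> answer according to the second flip (heads = yes, tails = no)
    return 1 if random.random() < 0.5 else 0

# a population whose true "yes" rate is 70%
population = [1] * 7000 + [0] * 3000

noisy_mean = sum(randomized_response(p) for p in population) / len(population)

# de-skew: observed = 0.5 * true + 0.5 * 0.5, so true = 2 * observed - 0.5
augmented_result = 2 * noisy_mean - 0.5

print(noisy_mean, augmented_result)  # near 0.6 and near 0.7, respectively
```

With only a handful of respondents the de-skewed estimate bounces around wildly, which is exactly the privacy/accuracy tradeoff described above.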
- ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "x_DCCq81NrR5", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "3kTgbGEnNrR9", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "h3gQASKuNrSB", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "UGo0wEFKNrSD", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "MyYXDeDhNrSH", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "n9EXjcX-NrSL", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "xUCLwJObNrSO", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "kNwpUP7HNrSQ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "cOm1aBq8NrST", - "colab_type": "text" - }, - "source": [ - "# Project: Varying Amounts of Noise\n", - "\n", - "In this project, I want you to augment the randomized response query (the one we just wrote) to allow for varying amounts of randomness to be added. Specifically, I want you to bias the coin flip to be higher or lower and then run the same experiment. \n", - "\n", - "Note - this one is a bit trickier than you might expect. 
You need to both adjust the likelihood of the first coin flip AND the de-skewing at the end (where we create the \"augmented_result\" variable)." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "AUKLRajhNrSV", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "DPTScgmQNrSX", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "7cLnh8ryNrSZ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "bOTnIAQSNrSe", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "CmxohDVBNrSn", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "KpI8-sx1NrSr", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "VMGvYTRbNrSu", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "GiZr7oIiNrS5", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "cS7J1evZNrS9", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "nqOIsCIHNrS_", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", 
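A possible sketch for the varying-noise project. It generalizes the 50/50 de-skewing formula to an arbitrary honesty probability; the function names are my own:

```python
import random

def biased_randomized_response(truth, honesty_prob):
    # biased first "coin": answer honestly with probability honesty_prob
    if random.random() < honesty_prob:
        return truth
    # otherwise answer according to a fair second coin
    return 1 if random.random() < 0.5 else 0

def deskew(noisy_mean, honesty_prob):
    # observed = honesty_prob * true + (1 - honesty_prob) * 0.5
    return (noisy_mean - (1 - honesty_prob) * 0.5) / honesty_prob

population = [1] * 7000 + [0] * 3000  # true mean is 0.7

for honesty_prob in (0.9, 0.5, 0.2):
    noisy = sum(biased_randomized_response(p, honesty_prob)
                for p in population) / len(population)
    augmented_result = deskew(noisy, honesty_prob)
    print(honesty_prob, round(augmented_result, 3))  # each near 0.7
```

As `honesty_prob` shrinks (more privacy), the `1 / honesty_prob` factor in the de-skewing amplifies the sampling noise, so the estimates get less accurate.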
- "metadata": { - "id": "FRqBEo82NrTE", - "colab_type": "text" - }, - "source": [ - "# Lesson: The Formal Definition of Differential Privacy\n", - "\n", - "The previous method of adding noise was called \"Local Differentail Privacy\" because we added noise to each datapoint individually. This is necessary for some situations wherein the data is SO sensitive that individuals do not trust noise to be added later. However, it comes at a very high cost in terms of accuracy. \n", - "\n", - "However, alternatively we can add noise AFTER data has been aggregated by a function. This kind of noise can allow for similar levels of protection with a lower affect on accuracy. However, participants must be able to trust that no-one looked at their datapoints _before_ the aggregation took place. In some situations this works out well, in others (such as an individual hand-surveying a group of people), this is less realistic.\n", - "\n", - "Nevertheless, global differential privacy is incredibly important because it allows us to perform differential privacy on smaller groups of individuals with lower amounts of noise. Let's revisit our sum functions." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "nQwYWNZcNrTE", - "colab_type": "code", - "colab": {} - }, - "source": [ - "db, pdbs = create_db_and_parallels(100)\n", - "\n", - "def query(db):\n", - " return torch.sum(db.float())\n", - "\n", - "def M(db):\n", - " query(db) + noise\n", - "\n", - "query(db)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6Qvj7T79NrTH", - "colab_type": "text" - }, - "source": [ - "So the idea here is that we want to add noise to the output of our function. We actually have two different kinds of noise we can add - Laplacian Noise or Gaussian Noise. 
However, before we do so, we need to dive into the formal definition of Differential Privacy.\n", - "\n", - "![alt text](dp_formula.png \"Title\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9Gh9nZDhNrTI", - "colab_type": "text" - }, - "source": [ - "_Image From: \"The Algorithmic Foundations of Differential Privacy\" - Cynthia Dwork and Aaron Roth - https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf_" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "D73LS5IdNrTJ", - "colab_type": "text" - }, - "source": [ - "This definition does not _create_ differential privacy, instead it is a measure of how much privacy is afforded by a query M. Specifically, it's a comparison between running the query M on a database (x) and a parallel database (y). As you remember, parallel databases are defined to be the same as a full database (x) with one entry/person removed.\n", - "\n", - "Thus, this definition says that FOR ALL parallel databases, the probability of any set of outputs under M(x) is at most e^epsilon times the probability of those same outputs under M(y) - that is, Pr[M(x) in S] <= e^epsilon * Pr[M(y) in S] + delta - where the delta term allows the epsilon constraint to fail with some (small) probability. Thus, this definition is called \"epsilon delta\" differential privacy.\n", - "\n", - "# Epsilon\n", - "\n", - "Let's unpack the intuition of this for a moment. \n", - "\n", - "Epsilon Zero: If a query satisfied this inequality where epsilon was set to 0, then that would mean that the query on every parallel database outputs the exact same distribution of values as on the full database. As you may remember, when we calculated the \"threshold\" function, often the Sensitivity was 0. 
In that case, the epsilon also happened to be zero.\n", - "\n", - "Epsilon One: If a query satisfied this inequality with epsilon 1, then the probability of any output under M(x) would be at most e^1 times its probability under M(y) - or more precisely - the maximum distance between the two random distributions M(x) and M(y), measured on a log-probability scale, is 1 (because all these queries have some amount of randomness in them, just like we observed in the last section).\n", - "\n", - "# Delta\n", - "\n", - "Delta is basically the probability that the epsilon guarantee breaks. Namely, sometimes the epsilon is different for some queries than it is for others. For example, you may remember when we were calculating the sensitivity of threshold, most of the time sensitivity was 0 but sometimes it was 1. Thus, we could describe this as \"epsilon zero but non-zero delta\", which would say that epsilon is perfect except for some probability of the time when it's arbitrarily higher. Note that this expression doesn't represent the full tradeoff between epsilon and delta." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "1CH6FPP0NrTK", - "colab_type": "text" - }, - "source": [ - "# Lesson: How To Add Noise for Global Differential Privacy\n", - "\n", - "In this lesson, we're going to learn about how to take a query and add varying amounts of noise so that it satisfies a certain degree of differential privacy. In particular, we're going to leave behind the Local Differential Privacy previously discussed and instead opt to focus on Global Differential Privacy. \n", - "\n", - "So, to sum up, this lesson is about adding noise to the output of our query so that it satisfies a certain epsilon-delta differential privacy threshold.\n", - "\n", - "There are two kinds of noise we can add - Gaussian Noise or Laplacian Noise. Generally speaking, Laplacian is better, but both are still valid. 
Now to the hard question...\n", - "\n", - "### How much noise should we add?\n", - "\n", - "The amount of noise we need to add to the output of a query is a function of four things:\n", - "\n", - "- the type of noise (Gaussian/Laplacian)\n", - "- the sensitivity of the query/function\n", - "- the desired epsilon (ε)\n", - "- the desired delta (δ)\n", - "\n", - "Thus, for each type of noise we're adding, we have a different way of calculating how much to add as a function of sensitivity, epsilon, and delta. We're going to focus on Laplacian noise. Laplacian noise is increased/decreased according to a \"scale\" parameter b. We choose \"b\" based on the following formula.\n", - "\n", - "b = sensitivity(query) / epsilon\n", - "\n", - "In other words, if we set b to be this value, then we know that we will have a privacy leakage of <= epsilon. Furthermore, the nice thing about Laplace is that it guarantees this with delta == 0. There are some tunings where we can have very low epsilon where delta is non-zero, but we'll ignore them for now.\n", - "\n", - "### Querying Repeatedly\n", - "\n", - "- if we query the database multiple times - we can simply add the epsilons (even if each query uses a different amount of noise and therefore a different epsilon)." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "kWXSly0zNrTK", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "n1J2kT3SNrTN", - "colab_type": "text" - }, - "source": [ - "# Project: Create a Differentially Private Query\n", - "\n", - "In this project, I want you to take what you learned in the previous lesson and create a query function which sums over the database and adds just the right amount of noise such that it satisfies an epsilon constraint. Write a query for both \"sum\" and for \"mean\". Ensure that you use the correct sensitivity measures for both."
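One way this project could be sketched, using b = sensitivity / epsilon from the lesson. I sample Laplace noise via the inverse CDF so the snippet needs only the standard library (numpy's `np.random.laplace(0, b)` would serve equally well), and the `dp_mean` sensitivity follows the course's ~1/len(db) convention:

```python
import math
import random

def laplace_noise(b):
    # inverse-CDF sample from a Laplace(0, b) distribution
    u = random.random() - 0.5
    return -b * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def dp_sum(db, epsilon):
    # a sum over a 0/1 database has sensitivity 1, so b = 1 / epsilon
    return sum(db) + laplace_noise(1 / epsilon)

def dp_mean(db, epsilon):
    # the mean has sensitivity roughly 1 / len(db), so b shrinks with the db
    return sum(db) / len(db) + laplace_noise(1 / (len(db) * epsilon))

db = [random.randint(0, 1) for _ in range(100)]
print(sum(db), dp_sum(db, epsilon=0.5))
print(sum(db) / len(db), dp_mean(db, epsilon=0.5))
```

Notice that for the same epsilon, `dp_mean` adds far less noise than `dp_sum` - exactly the sensitivity intuition from earlier in the section.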
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "AjtzGiLBNrTN", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# try this project here!" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "lznEr0BONrTQ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Goc0emOKNrTT", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "3gWoxTewNrTd", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "o1kv4rPINrTj", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Lko_2XOUNrTn", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Ij6Zh3WUNrTr", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "b3p0xznUNrTy", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "S-yiuwr3NrT5", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "bv2Z7i8bNrT-", - "colab_type": "text" - }, - "source": [ - "# Lesson: Differential Privacy for Deep Learning\n", - "\n", - "So in the last lessons you may have been wondering - what does all of this have to do with Deep Learning? 
Well, these same techniques we were just studying form the core primitives for how Differential Privacy provides guarantees in the context of Deep Learning. \n", - "\n", - "Previously, we defined perfect privacy as \"a query to a database returns the same value even if we remove any person from the database\", and used this intuition in the description of epsilon/delta. In the context of deep learning we have a similar standard.\n", - "\n", - "Training a model on a dataset should return the same model even if we remove any person from the dataset.\n", - "\n", - "Thus, we've replaced \"querying a database\" with \"training a model on a dataset\". In essence, the training process is a kind of query. However, one should note that this adds two points of complexity which database queries did not have:\n", - "\n", - " 1. do we always know where \"people\" are referenced in the dataset?\n", - " 2. neural models almost never train to the same output model, even on identical data\n", - "\n", - "The answer to (1) is to treat each training example as a single, separate person. Strictly speaking, this is often overly zealous as some training examples have no relevance to people and others may reference multiple or partial people (consider an image with multiple people contained within it). Thus, localizing exactly where \"people\" are referenced, and thus how much your model would change if people were removed, is challenging.\n", - "\n", - "The answer to (2) is also an open problem - but several interesting proposals have been made. We're going to focus on one of the most popular proposals, PATE.\n", - "\n", - "## An Example Scenario: A Health Neural Network\n", - "\n", - "First we're going to consider a scenario - you work for a hospital and you have a large collection of images about your patients. However, you don't know what's in them. 
You would like to use these images to develop a neural network which can automatically classify them; however, since your images aren't labeled, they aren't sufficient to train a classifier. \n", - "\n", - "However, being a cunning strategist, you realize that you can reach out to 10 partner hospitals which DO have annotated data. It is your hope to train your new classifier on their datasets so that you can automatically label your own. While these hospitals are interested in helping, they have privacy concerns regarding information about their patients. Thus, you will use the following technique to train a classifier which protects the privacy of patients in the other hospitals.\n", - "\n", - "- 1) You'll ask each of the 10 hospitals to train a model on their own datasets (all of which have the same kinds of labels)\n", - "- 2) You'll then use each of the 10 partner models to predict on your local dataset, generating 10 labels for each of your datapoints\n", - "- 3) Then, for each local data point (now with 10 labels), you will perform a DP query to generate the final true label. This query is a \"max\" function, which returns the most frequent label across the 10 labels. We will need to add Laplacian noise to make this Differentially Private to a certain epsilon/delta constraint.\n", - "- 4) Finally, we will retrain a new model on our local dataset which now has labels. This will be our final \"DP\" model.\n", - "\n", - "So, let's walk through these steps. I will assume you're already familiar with how to train/predict a deep neural network, so we'll skip steps 1 and 2 and work with example data. We'll focus instead on step 3, namely how to perform the DP query for each example using toy data.\n", - "\n", - "So, let's say we have 10,000 training examples, and we've got 10 labels for each example (from our 10 \"teacher models\" which were trained directly on private data). Each label is chosen from a set of 10 possible labels (categories) for each image."
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "30skLUk2NrUA", - "colab_type": "code", - "colab": {} - }, - "source": [ - "import numpy as np" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "q3tV7jL4NrUE", - "colab_type": "code", - "colab": {} - }, - "source": [ - "num_teachers = 10 # we're working with 10 partner hospitals\n", - "num_examples = 10000 # the size of OUR dataset\n", - "num_labels = 10 # number of labels for our classifier" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "IaT_f1zKNrUJ", - "colab_type": "code", - "colab": {} - }, - "source": [ - "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int).transpose(1,0) # fake predictions" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "nt38zWhYNrUW", - "colab_type": "code", - "colab": {} - }, - "source": [ - "new_labels = list()\n", - "for an_image in preds:\n", - "\n", - " label_counts = np.bincount(an_image, minlength=num_labels).astype(float) # float, so the noise below isn't truncated\n", - "\n", - " epsilon = 0.1\n", - " beta = 1 / epsilon\n", - "\n", - " for i in range(len(label_counts)):\n", - " label_counts[i] += np.random.laplace(0, beta)\n", - "\n", - " new_label = np.argmax(label_counts)\n", - " \n", - " new_labels.append(new_label)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "xc41XXHDNrUo", - "colab_type": "code", - "colab": {} - }, - "source": [ - "# new_labels" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "u52N4rs-NrU0", - "colab_type": "text" - }, - "source": [ - "# PATE Analysis" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "6-tQ3PAANrU9", - "colab_type": "code", - "colab": {} - }, - "source": [ - "labels = np.array([9, 9, 3, 6, 9, 9, 9, 9, 8, 2])\n", - "counts = np.bincount(labels, minlength=10)\n", - "query_result = 
np.argmax(counts)\n", - "query_result" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "G_Ui-1O3NrVA", - "colab_type": "code", - "colab": {} - }, - "source": [ - "from syft.frameworks.torch.differential_privacy import pate" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "enf7GqbBNrVF", - "colab_type": "code", - "colab": {} - }, - "source": [ - "num_teachers, num_examples, num_labels = (100, 100, 10)\n", - "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int) #fake preds\n", - "indices = (np.random.rand(num_examples) * num_labels).astype(int) # true answers\n", - "\n", - "preds[:,0:10] *= 0\n", - "\n", - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", - "\n", - "assert data_dep_eps < data_ind_eps\n", - "\n" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "rtcGSVu_NrVM", - "colab_type": "code", - "colab": {} - }, - "source": [ - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", - "print(\"Data Independent Epsilon:\", data_ind_eps)\n", - "print(\"Data Dependent Epsilon:\", data_dep_eps)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "54GPud2ENrVU", - "colab_type": "code", - "colab": {} - }, - "source": [ - "preds[:,0:50] *= 0" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "fvxVNNEMNrVY", - "colab_type": "code", - "colab": {} - }, - "source": [ - "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5, moments=20)\n", - "print(\"Data Independent Epsilon:\", data_ind_eps)\n", - "print(\"Data Dependent Epsilon:\", data_dep_eps)" - ], - "execution_count": 0, - "outputs": [] - }, - { 
- "cell_type": "code", - "metadata": { - "id": "RY-gAWL9NrVd", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "BmlFIVPaNrVk", - "colab_type": "text" - }, - "source": [ - "# Where to Go From Here\n", - "\n", - "\n", - "Read:\n", - " - Algorithmic Foundations of Differential Privacy: https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf\n", - " - Deep Learning with Differential Privacy: https://arxiv.org/pdf/1607.00133.pdf\n", - " - The Ethical Algorithm: https://www.amazon.com/Ethical-Algorithm-Science-Socially-Design/dp/0190948205\n", - " \n", - "Topics:\n", - " - The Exponential Mechanism\n", - " - The Moment's Accountant\n", - " - Differentially Private Stochastic Gradient Descent\n", - "\n", - "Advice:\n", - " - For deployments - stick with public frameworks!\n", - " - Join the Differential Privacy Community\n", - " - Don't get ahead of yourself - DP is still in the early days" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "wa9mH1R7NrVn", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "QhSsS_zsNrVz", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Q8GdrZB1NrV2", - "colab_type": "text" - }, - "source": [ - "# Section Project:\n", - "\n", - "For the final project for this section, you're going to train a DP model using this PATE method on the MNIST dataset, provided below." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "BsblHz0vNrV3", - "colab_type": "code", - "colab": {} - }, - "source": [ - "import torchvision.datasets as datasets\n", - "mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=None)" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "ocrWt2AtNrWC", - "colab_type": "code", - "colab": {} - }, - "source": [ - "train_data = mnist_trainset.train_data\n", - "train_targets = mnist_trainset.train_labels" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "31jgZwaNNrWL", - "colab_type": "code", - "colab": {} - }, - "source": [ - "test_data = mnist_trainset.test_data\n", - "test_targets = mnist_trainset.test_labels" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "6gC9xD2hNrWU", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - } - ] -} \ No newline at end of file From dbf63c2a30aebaf3adf268ec1d02c2f54d2dd72e Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Wed, 21 Aug 2019 17:34:45 +0530 Subject: [PATCH 04/14] Add files via upload --- Section_1_Differential_Privacy_(2) (1).ipynb | 2310 ++++++++++++++++++ 1 file changed, 2310 insertions(+) create mode 100644 Section_1_Differential_Privacy_(2) (1).ipynb diff --git a/Section_1_Differential_Privacy_(2) (1).ipynb b/Section_1_Differential_Privacy_(2) (1).ipynb new file mode 100644 index 0000000..a6f45ef --- /dev/null +++ b/Section_1_Differential_Privacy_(2) (1).ipynb @@ -0,0 +1,2310 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + 
"name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "colab": { + "name": "Section_1_Differential_Privacy (2).ipynb", + "version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "5SmPdWKcNrO7", + "colab_type": "text" + }, + "source": [ + "## Lesson: Toy Differential Privacy - Simple Database Queries" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Gm1VK8ofNrO-", + "colab_type": "text" + }, + "source": [ + "In this section we're going to play around with Differential Privacy in the context of a database query. The database is going to be a VERY simple database with only one boolean column. Each row corresponds to a person. Each value corresponds to whether or not that person has a certain private attribute (such as whether they have a certain disease, or whether they are above/below a certain age). We are then going to learn how to know whether a database query over such a small database is differentially private or not - and more importantly - what techniques are at our disposal to ensure various levels of privacy\n", + "\n", + "\n", + "### First We Create a Simple Database\n", + "\n", + "Step one is to create our database - we're going to do this by initializing a random list of 1s and 0s (which are the entries in our database). Note - the number of entries directly corresponds to the number of people in our database." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "WGTsLkPNNrO_", + "colab_type": "code", + "outputId": "4199570c-be57-4acd-ea8c-756a2014a651", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "import torch\n", + "\n", + "# the number of entries in our database\n", + "num_entries = 5000\n", + "\n", + "db = torch.rand(num_entries) > 0.5\n", + "db" + ], + "execution_count": 1, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0, 0, 1, ..., 1, 0, 0], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 1 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nhJRhUDrNrPF", + "colab_type": "text" + }, + "source": [ + "## Project: Generate Parallel Databases\n", + "\n", + "Key to the definition of differential privacy is the ability to ask the question \"When querying a database, if I removed someone from the database, would the output of the query be any different?\". Thus, in order to check this, we must construct what we term \"parallel databases\" which are simply databases with one entry removed. \n", + "\n", + "In this first project, I want you to create a list of every parallel database to the one currently contained in the \"db\" variable. 
Then, I want you to create a function which both:\n", + "\n", + "- creates the initial database (db)\n", + "- creates all parallel databases" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JJBAPNGNNrPG", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import torch" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QsZUO3xyNrPK", + "colab_type": "code", + "outputId": "cada842f-e1e3-4ded-eb1a-84d884cbcd2a", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 75 + } + }, + "source": [ + "!pip install torch\n" + ], + "execution_count": 3, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Requirement already satisfied: torch in /usr/local/lib/python3.6/dist-packages (1.1.0)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch) (1.16.4)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "aCs8swrFNrPO", + "colab_type": "code", + "outputId": "a9d48f44-3312-4b98-f141-a4c2a9de4bb4", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "db = torch.rand(num_entries) > 0.5\n", + "db" + ], + "execution_count": 4, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 1, 1, ..., 1, 0, 1], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 4 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IxKQz_BDNrPR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "remove_index = 2" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Kz1iFm9FNrPU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def get_parallel_db(db, remove_index) :\n", + " \n", + " return torch.cat((db[0:remove_index], db[remove_index+1:]))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + 
"id": "AFsIZVt2RFqC", + "colab_type": "code", + "outputId": "79c362b0-500c-4b88-b9b1-d2cfa6af6885", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "get_parallel_db(db, 52352) " + ], + "execution_count": 7, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 1, 1, ..., 1, 0, 1], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 7 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IOCQAp8RSZYZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def get_parallel_dbs(db) :\n", + "\n", + " parallel_dbs = list()\n", + " \n", + " for i in range(len(db)) :\n", + " # remove entry i so that each parallel db drops a different person\n", + " pdb = get_parallel_db(db, i)\n", + " parallel_dbs.append(pdb)\n", + " \n", + " return parallel_dbs" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "7HKGj8DvSaWp", + "colab_type": "code", + "colab": {} + }, + "source": [ + "pdbs = get_parallel_dbs(db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "1uAFmrb2S5XO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def create_db_and_parallels(num_entries) :\n", + " \n", + " db = torch.rand(num_entries) > 0.5\n", + " pdbs = get_parallel_dbs(db)\n", + " \n", + " return db, pdbs" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9IujNF8LS_XR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(20)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rYG1LFeJU6K_", + "colab_type": "code", + "outputId": "ae11a27f-45b0-4d49-f0ec-c24a9625ac4f", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db" + ], + "execution_count": 12, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + 
"tensor([0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 12 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lpzJBeKgU913", + "colab_type": "code", + "outputId": "15c4a30d-1c00-455a-a48e-0158d805c1a3", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 777 + } + }, + "source": [ + "pdbs" + ], + "execution_count": 13, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 
0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8),\n", + " tensor([0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1],\n", + " dtype=torch.uint8)]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 13 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WNhfE_cJNrPY", + "colab_type": "text" + }, + "source": [ + "# Lesson: Towards Evaluating The Differential Privacy of a Function\n", + "\n", + "Intuitively, we want to be able to query our database and evaluate whether or not the result of the query is leaking \"private\" information. As mentioned previously, this is about evaluating whether the output of a query changes when we remove someone from the database. Specifically, we want to evaluate the *maximum* amount the query changes when someone is removed (maximum over all possible people who could be removed). So, in order to evaluate how much privacy is leaked, we're going to iterate over each person in the database and measure the difference in the output of the query relative to when we query the entire database. \n", + "\n", + "Just for the sake of argument, let's make our first \"database query\" a simple sum. Aka, we're going to count the number of 1s in the database." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MaT4uxrtY9Nl", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(5000)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "PjvnskWANrPe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db):\n", + " return db.sum()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "V1RneGR-NrPo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "full_db_result = query(db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "e8ZfuZ-6NrPu", + "colab_type": "code", + "colab": {} + }, + "source": [ + "sensitivity = 0\n", + "for pdb in pdbs:\n", + " pdb_result = query(pdb)\n", + " \n", + " db_distance = torch.abs(pdb_result - full_db_result)\n", + " \n", + " if(db_distance > sensitivity):\n", + " sensitivity = db_distance" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZbOjR2BxNrP2", + "colab_type": "code", + "outputId": "1c0832b5-8ad5-4d3e-c63c-d58d138cd7b6", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "sensitivity" + ], + "execution_count": 24, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "0" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 24 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nzNvdQE4NrP6", + "colab_type": "text" + }, + "source": [ + "# Project - Evaluating the Privacy of a Function\n", + "\n", + "In the last section, we measured the difference between each parallel db's query result and the query result for the entire database and then calculated the max value (which was 1). This value is called \"sensitivity\", and it corresponds to the function we chose for the query. 
Namely, the \"sum\" query will always have a sensitivity of exactly 1. However, we can also calculate sensitivity for other functions as well.\n", + "\n", + "Let's try to calculate sensitivity for the \"mean\" function." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bfzPBl6xNrP7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def sensitivity(query, n_entries=1000) :\n", + " \n", + " db, pdbs = create_db_and_parallels(n_entries)\n", + " \n", + " full_db_result = query(db)\n", + " \n", + " max_distance = 0\n", + " for pdb in pdbs :\n", + " pdb_result = query(pdb)\n", + " \n", + " db_distance = torch.abs(pdb_result - full_db_result)\n", + " \n", + " if(db_distance >max_distance) :\n", + " max_distance = db_distance\n", + " \n", + " \n", + " return max_distance" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "gforqWGZNrP-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db) :\n", + " return db.float().mean()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "qIKSIIoONrQE", + "colab_type": "code", + "outputId": "79635003-08a1-4d54-ab2b-3f5b1426bf8b", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "sensitivity(query)" + ], + "execution_count": 27, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(0.0005)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 27 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "pDuXxPWDNrQI", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(20)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "evC2zFNGNrQO", + "colab_type": "code", + "outputId": "ac88f324-5a4f-4e86-dd22-16c62080dff9", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": 
[ + "db" + ], + "execution_count": 29, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1],\n", + " dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 29 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Amb03X0LNrQo", + "colab_type": "text" + }, + "source": [ + "Wow! That sensitivity is WAY lower. Note the intuition here. \"Sensitivity\" is measuring how sensitive the output of the query is to a person being removed from the database. For a simple sum, this is always 1, but for the mean, removing a person is going to change the result of the query by roughly 1 divided by the size of the database (which is much smaller). Thus, \"mean\" is a VASTLY less \"sensitive\" function (query) than SUM." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ujLjfLfZNrQt", + "colab_type": "text" + }, + "source": [ + "# Project: Calculate L1 Sensitivity For Threshold\n", + "\n", + "In this project, I want you to calculate the sensitivity for the \"threshold\" function. \n", + "\n", + "- First compute the sum over the database (i.e. sum(db)) and return whether that sum is greater than a certain threshold.\n", + "- Then, I want you to create databases of size 10 and threshold of 5 and calculate the sensitivity of the function. \n", + "- Finally, re-initialize the database 10 times and calculate the sensitivity each time." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "NdoBSl7UNrQv", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db, threshold=5) :\n", + " return (db.sum() > threshold) . 
float()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rkL2cW8INrQ_", + "colab_type": "code", + "outputId": "2d2d0804-fb1f-4cdf-d498-bcefea8b48bf", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 207 + } + }, + "source": [ + "for i in range(10) :\n", + " sens_f = sensitivity(query, n_entries=10)\n", + " print(sens_f)" + ], + "execution_count": 39, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor(1.)\n", + "tensor(1.)\n", + "0\n", + "0\n", + "0\n", + "0\n", + "0\n", + "0\n", + "0\n", + "0\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EYDeItdmNrRP", + "colab_type": "text" + }, + "source": [ + "# Lesson: A Basic Differencing Attack\n", + "\n", + "Sadly none of the functions we've looked at so far are differentially private (despite them having varying levels of sensitivity). The most basic type of attack can be done as follows.\n", + "\n", + "Let's say we wanted to figure out a specific person's value in the database. 
All we would have to do is query for the sum of the entire database and then the sum of the entire database without that person!\n", + "\n", + "# Project: Perform a Differencing Attack on Row 10\n", + "\n", + "In this project, I want you to construct a database and then demonstrate how you can use two different sum queries to expose the value of the person represented by row 10 in the database (note: you'll need to use a database with at least 10 rows)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "b7iBppCDNrRQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, _ = create_db_and_parallels(100)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "-zt1hTwvNrRU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "pdb = get_parallel_db(db, remove_index=10)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "_8J6Bt3ONrRX", + "colab_type": "code", + "outputId": "20a49eca-3476-421a-d00e-be77e18129bd", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "db[10]" + ], + "execution_count": 42, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(0, dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 42 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VM8ZkQ62NrRb", + "colab_type": "code", + "outputId": "31fe070b-50d8-4653-a6ef-33c00651bf7d", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "sum(db)" + ], + "execution_count": 43, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(59, dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 43 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "R11dyl7gNrRe", + "colab_type": "code", + "outputId": 
"fc94e418-2407-46c2-cc46-f3cf32125123", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "sum(db) - sum(pdb)" + ], + "execution_count": 44, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(0, dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 44 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bw_5CeM6NrRi", + "colab_type": "code", + "outputId": "e656a953-cfdb-4932-95c1-552bbbf61f93", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "(sum(db) . float() / len(db)) - (sum(pdb).float() / len(pdb))" + ], + "execution_count": 45, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(-0.0060)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 45 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UpgIeau6NrRm", + "colab_type": "code", + "outputId": "98f262c4-33fe-43f0-b553-c1dd1b18cd39", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "(sum(db).float() > 49) - (sum(pdb).float() > 49) " + ], + "execution_count": 46, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(0, dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 46 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fg6FPpcnNrRx", + "colab_type": "text" + }, + "source": [ + "# Project: Local Differential Privacy\n", + "\n", + "As you can see, the basic sum query is not differentially private at all! In truth, differential privacy always requires a form of randomness added to the query. 
Let me show you what I mean.\n", + "\n", + "### Randomized Response (Local Differential Privacy)\n", + "\n", + "Let's say I have a group of people I wish to survey about a very taboo behavior which I think they will lie about (say, I want to know if they have ever committed a certain kind of crime). I'm not a policeman; I'm just trying to collect statistics to understand the higher-level trend in society. So, how do we do this? One technique is to add randomness to each person's response by giving each person the following instructions (assuming I'm asking a simple yes/no question):\n", + "\n", + "- Flip a coin 2 times.\n", + "- If the first coin flip is heads, answer honestly.\n", + "- If the first coin flip is tails, answer according to the second coin flip (heads for yes, tails for no)!\n", + "\n", + "Thus, each person is now protected with \"plausible deniability\". If they answer \"Yes\" to the question \"have you committed X crime?\", then it might be because they actually did, or it might be because they are answering according to a random coin flip. Each person has a high degree of protection. Furthermore, we can recover the underlying statistics with some accuracy, as the \"true statistics\" are simply averaged with a 50% probability. Thus, if we collect a bunch of samples and it turns out that 60% of people answer yes, then we know that the TRUE distribution is actually centered around 70%, because 70% averaged with 50% (a coin flip) is 60%, which is the result we obtained. \n", + "\n", + "However, it should be noted that, especially when we only have a few samples, this comes at the cost of accuracy. This tradeoff exists across all of Differential Privacy. The greater the privacy protection (plausible deniability), the less accurate the results. \n", + "\n", + "Let's implement this local DP for our database from before!" 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "k5IV99F8NrRy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "x_DCCq81NrR5", + "colab_type": "code", + "outputId": "2c57e9df-d48a-4bdd-bbcf-eaf45a12f856", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 112 + } + }, + "source": [ + "db" + ], + "execution_count": 48, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0,\n", + " 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1,\n", + " 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0,\n", + " 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0,\n", + " 0, 0, 1, 0], dtype=torch.uint8)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 48 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "3kTgbGEnNrR9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db) :\n", + " \n", + " true_result = torch.mean(db.float())\n", + "\n", + " first_coin_flip = (torch.rand(len(db)) > 0.5).float()\n", + " second_coin_flip = (torch.rand(len(db)) > 0.5).float()\n", + " \n", + " augmented_db = (db.float() * first_coin_flip) + ((1-first_coin_flip) * second_coin_flip)\n", + " \n", + " db_result = torch.mean(augmented_db.float()) * 2 - 0.5\n", + " \n", + " return db_result, true_result" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "h3gQASKuNrSB", + "colab_type": "code", + "outputId": "5501a55d-4a7f-4925-b2a9-612a8d12b15c", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(10)\n", + "private_result, true_result = query(db)\n", + "print(\"With Noise\" + 
str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 50, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.1000)\n", + "Without Noisetensor(0.2000)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UGo0wEFKNrSD", + "colab_type": "code", + "outputId": "76362031-162c-4f65-853a-5bd3870d18f4", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "private_result, true_result = query(db)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 51, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.3200)\n", + "Without Noisetensor(0.4600)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MyYXDeDhNrSH", + "colab_type": "code", + "outputId": "e10af884-ff68-4bbe-95a0-ffaf9f4053d0", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(1000)\n", + "private_result, true_result = query(db)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 52, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.4860)\n", + "Without Noisetensor(0.5030)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "n9EXjcX-NrSL", + "colab_type": "code", + "outputId": "c9b6cbe8-9a79-428b-ace8-ad86dd348015", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(10000)\n", + "private_result, true_result = query(db)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 53, + "outputs": [ + { + 
"output_type": "stream", + "text": [ + "With Noisetensor(0.5232)\n", + "Without Noisetensor(0.5087)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cOm1aBq8NrST", + "colab_type": "text" + }, + "source": [ + "# Project: Varying Amounts of Noise\n", + "\n", + "In this project, I want you to augment the randomized response query (the one we just wrote) to allow for varying amounts of randomness to be added. Specifically, I want you to bias the coin flip to be higher or lower and then run the same experiment. \n", + "\n", + "Note - this one is a bit trickier than you might expect. You need to both adjust the likelihood of the first coin flip AND the de-skewing at the end (where we create the \"augmented_result\" variable)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AUKLRajhNrSV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def query(db, noice=0.2) :\n", + " \n", + " true_result = torch.mean(db.float())\n", + "\n", + " first_coin_flip = (torch.rand(len(db)) > noice).float()\n", + " second_coin_flip = (torch.rand(len(db)) > 0.5).float()\n", + " \n", + " augmented_db = (db.float() * first_coin_flip) + ((1-first_coin_flip) * second_coin_flip)\n", + " \n", + " sk_result = augmented_db.float().mean()\n", + " \n", + " # P(random answer) = noice, so sk_result ~= (1-noice)*true + 0.5*noice; invert that mixing\n", + " private_result = ((sk_result / noice) - 0.5) * noice / (1-noice)\n", + " \n", + " return private_result, true_result" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "R_XDE2JqxwRA", + "colab_type": "code", + "outputId": "8d8b8b6f-ebbd-48c1-deea-5079759b899b", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "private_result, true_result = query(db, noice=0.1)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 57, + "outputs": [ + { + "output_type": "stream", + 
"text": [ + "With Noisetensor(0.4556)\n", + "Without Noisetensor(0.4900)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7uAunnot6SPH", + "colab_type": "code", + "outputId": "42e8e657-5229-45f2-e1a1-37dd2bd753a8", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "private_result, true_result = query(db, noice= 0.2)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 58, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.5000)\n", + "Without Noisetensor(0.5300)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "aTtH-Ics6RzQ", + "colab_type": "code", + "outputId": "f03999e8-f016-4cd6-b005-a48b6c9c8d17", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "private_result, true_result = query(db, noice= 0.4)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 59, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.6500)\n", + "Without Noisetensor(0.6400)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "-kgGHfzw6cml", + "colab_type": "code", + "outputId": "46f47a6e-1264-4c6e-e553-0428444d57ce", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(10000)\n", + "private_result, true_result = query(db, noice= 0.8)\n", + "print(\"With Noise\" + str(private_result))\n", + "print(\"Without Noise\" + str(true_result))" + ], + "execution_count": 60, + "outputs": [ + { + "output_type": "stream", + "text": [ + "With Noisetensor(0.5155)\n", + "Without Noisetensor(0.5073)\n" + ], + "name": 
"stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "DPTScgmQNrSX", + "colab_type": "code", + "outputId": "7351913d-16cb-44db-9429-c10c1854b753", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "torch.rand(len(db))" + ], + "execution_count": 61, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0.1358, 0.7498, 0.6210, ..., 0.4323, 0.3016, 0.4702])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 61 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7cLnh8ryNrSZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "noice = 0.5\n", + "true_dist_mean = 0.7\n", + "noice_dist_mean = 0.5" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "bOTnIAQSNrSe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "augmented_db_mean = ((true_dist_mean * noice) + (noice_dist_mean * (1-noice)))\n" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "CmxohDVBNrSn", + "colab_type": "code", + "outputId": "b050d533-8dbd-4720-fbe9-2e3dc938f941", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "augmented_db_mean" + ], + "execution_count": 64, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "0.6" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 64 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FRqBEo82NrTE", + "colab_type": "text" + }, + "source": [ + "# Lesson: The Formal Definition of Differential Privacy\n", + "\n", + "The previous method of adding noise was called \"Local Differential Privacy\" because we added noise to each datapoint individually. This is necessary for some situations wherein the data is SO sensitive that individuals do not trust that noise will be added later. 
However, it comes at a very high cost in terms of accuracy. \n", + "\n", + "Alternatively, we can add noise AFTER data has been aggregated by a function. This kind of noise can allow for similar levels of protection with a lower effect on accuracy. However, participants must be able to trust that no one looked at their datapoints _before_ the aggregation took place. In some situations this works out well, in others (such as an individual hand-surveying a group of people), this is less realistic.\n", + "\n", + "Nevertheless, global differential privacy is incredibly important because it allows us to perform differential privacy on smaller groups of individuals with lower amounts of noise. Let's revisit our sum functions." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "nQwYWNZcNrTE", + "colab_type": "code", + "outputId": "fef4233b-ee6c-4bf0-fdd1-869930cfa657", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)\n", + "\n", + "def query(db):\n", + " return torch.sum(db.float())\n", + "\n", + "def M(db):\n", + " # conceptual sketch of a noisy mechanism - the noise term is defined in the next lesson\n", + " return query(db) + noise\n", + "\n", + "query(db)" + ], + "execution_count": 65, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor(48.)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 65 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6Qvj7T79NrTH", + "colab_type": "text" + }, + "source": [ + "So the idea here is that we want to add noise to the output of our function. We actually have two different kinds of noise we can add - Laplacian Noise or Gaussian Noise. 
Before we do so, however, we need to dive into the formal definition of Differential Privacy.\n", + "\n", + "![alt text](dp_formula.png \"Title\")\n", + "\n", + "i.e., for all sets S of possible outputs: $\\Pr[M(x) \\in S] \\leq e^{\\epsilon} \\Pr[M(y) \\in S] + \\delta$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9Gh9nZDhNrTI", + "colab_type": "text" + }, + "source": [ + "_Image From: \"The Algorithmic Foundations of Differential Privacy\" - Cynthia Dwork and Aaron Roth - https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf_" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D73LS5IdNrTJ", + "colab_type": "text" + }, + "source": [ + "This definition does not _create_ differential privacy; instead, it is a measure of how much privacy is afforded by a query M. Specifically, it's a comparison between running the query M on a database (x) and a parallel database (y). As you remember, parallel databases are defined to be the same as a full database (x) with one entry/person removed.\n", + "\n", + "Thus, this definition says that FOR ALL parallel databases, the maximum distance between a query on database (x) and the same query on database (y) will be e^epsilon, but that occasionally this constraint won't hold with probability delta. Thus, this definition is called \"epsilon-delta\" differential privacy.\n", + "\n", + "# Epsilon\n", + "\n", + "Let's unpack the intuition of this for a moment. \n", + "\n", + "Epsilon Zero: If a query satisfied this inequality where epsilon was set to 0, then that would mean that the query for all parallel databases outputted the exact same value as the full database. As you may remember, when we calculated the \"threshold\" function, often the Sensitivity was 0. 
In that case, the epsilon also happened to be zero.\n", + "\n", + "Epsilon One: If a query satisfied this inequality with epsilon 1, then the maximum distance between all queries would be 1 - or more precisely - the maximum distance between the two random distributions M(x) and M(y) is 1 (because all these queries have some amount of randomness in them, just like we observed in the last section).\n", + "\n", + "# Delta\n", + "\n", + "Delta is basically the probability that epsilon breaks. Namely, sometimes the epsilon is different for some queries than it is for others. For example, you may remember when we were calculating the sensitivity of threshold, most of the time sensitivity was 0 but sometimes it was 1. Thus, we could calculate this as \"epsilon zero but non-zero delta\" which would say that epsilon is perfect except for some probability of the time when it's arbitrarily higher. Note that this expression doesn't represent the full tradeoff between epsilon and delta." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1CH6FPP0NrTK", + "colab_type": "text" + }, + "source": [ + "# Lesson: How To Add Noise for Global Differential Privacy\n", + "\n", + "In this lesson, we're going to learn about how to take a query and add varying amounts of noise so that it satisfies a certain degree of differential privacy. In particular, we're going to leave behind the Local Differential privacy previously discussed and instead opt to focus on Global differential privacy. \n", + "\n", + "So, to sum up, this lesson is about adding noise to the output of our query so that it satisfies a certain epsilon-delta differential privacy threshold.\n", + "\n", + "There are two kinds of noise we can add - Gaussian Noise or Laplacian Noise. Generally speaking Laplacian is better, but both are still valid. 
Now to the hard question...\n", + "\n", + "### How much noise should we add?\n", + "\n", + "The amount of noise necessary to add to the output of a query is a function of four things:\n", + "\n", + "- the type of noise (Gaussian/Laplacian)\n", + "- the sensitivity of the query/function\n", + "- the desired epsilon (ε)\n", + "- the desired delta (δ)\n", + "\n", + "Thus, for each type of noise we're adding, we have a different way of calculating how much to add as a function of sensitivity, epsilon, and delta. We're going to focus on Laplacian noise. Laplacian noise is increased/decreased according to a \"scale\" parameter b. We choose \"b\" based on the following formula.\n", + "\n", + "b = sensitivity(query) / epsilon\n", + "\n", + "In other words, if we set b to be this value, then we know that we will have a privacy leakage of <= epsilon. Furthermore, the nice thing about Laplace is that it guarantees this with delta == 0. There are some tunings where we can have very low epsilon and a non-zero delta, but we'll ignore them for now.\n", + "\n", + "### Querying Repeatedly\n", + "\n", + "- If we query the database multiple times, we can simply add the epsilons (even if we change the amount of noise and the individual epsilons are not the same)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "kWXSly0zNrTK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "n1J2kT3SNrTN", + "colab_type": "text" + }, + "source": [ + "# Project: Create a Differentially Private Query\n", + "\n", + "In this project, I want you to take what you learned in the previous lesson and create a query function which sums over the database and adds just the right amount of noise such that it satisfies an epsilon constraint. Write a query for both \"sum\" and for \"mean\". Ensure that you use the correct sensitivity measures for both."
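As a reference point for the project, here is a minimal NumPy-only sketch of the b = sensitivity / epsilon rule and of the "epsilons add up" rule for repeated queries. It sits outside the notebook's own cells, and the database size, seed, and epsilon values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_answer, sensitivity, epsilon):
    # Laplace mechanism: scale b = sensitivity / epsilon gives (epsilon, 0)-DP
    b = sensitivity / epsilon
    return true_answer + rng.laplace(0.0, b)

db = (rng.random(100) > 0.5).astype(float)  # toy database of 0/1 entries

# sum over a 0/1 column has sensitivity 1; the mean over n entries has sensitivity 1/n
noisy_sum = laplace_mechanism(db.sum(), sensitivity=1.0, epsilon=0.5)
noisy_mean = laplace_mechanism(db.mean(), sensitivity=1.0 / len(db), epsilon=0.5)

# sequential composition: two epsilon = 0.5 queries spend epsilon = 1.0 in total
total_epsilon = 0.5 + 0.5
```

Note that the mean's smaller sensitivity means far less noise is needed for the same epsilon, which is why using the correct sensitivity for each query matters.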
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "AjtzGiLBNrTN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "epsilon = 0.0001" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "lznEr0BONrTQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import numpy as np" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Goc0emOKNrTT", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db, pdbs = create_db_and_parallels(100)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3gWoxTewNrTd", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def sum_query(db):\n", + " return db.sum()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "o1kv4rPINrTj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def laplacian_mech(db, query, sensitivity):\n", + " \n", + " # scale the Laplace noise by beta = sensitivity / epsilon\n", + " beta = sensitivity / epsilon\n", + " noise = torch.tensor(np.random.laplace(0, beta, 1))\n", + " \n", + " return query(db) + noise" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Lko_2XOUNrTn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def mean_query(db):\n", + " return torch.mean(db.float())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Ij6Zh3WUNrTr", + "colab_type": "code", + "outputId": "7fe2bd04-4698-4292-8fc2-2eb8637aead1", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "laplacian_mech(db, sum_query, 1)\n" + ], + "execution_count": 72, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([-4842.6141], dtype=torch.float64)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 72 + } + ] + }, + { + "cell_type": "code", +
"metadata": { + "id": "b3p0xznUNrTy", + "colab_type": "code", + "outputId": "aaf0b1db-dde0-4580-fa62-f7de1beb1288", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "laplacian_mech(db, mean_query, 1/100)" + ], + "execution_count": 73, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([-38.7372], dtype=torch.float64)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 73 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bv2Z7i8bNrT-", + "colab_type": "text" + }, + "source": [ + "# Lesson: Differential Privacy for Deep Learning\n", + "\n", + "So in the last lessons you may have been wondering - what does all of this have to do with Deep Learning? Well, these same techniques we were just studying form the core primitives for how Differential Privacy provides guarantees in the context of Deep Learning. \n", + "\n", + "Previously, we defined perfect privacy as \"a query to a database returns the same value even if we remove any person from the database\", and used this intuition in the description of epsilon/delta. In the context of deep learning, we have a similar standard.\n", + "\n", + "Training a model on a dataset should return the same model even if we remove any person from the dataset.\n", + "\n", + "Thus, we've replaced \"querying a database\" with \"training a model on a dataset\". In essence, the training process is a kind of query. However, one should note that this adds two points of complexity which database queries did not have:\n", + "\n", + " 1. do we always know where \"people\" are referenced in the dataset?\n", + " 2. neural models rarely train to the same output model, even on identical data\n", + "\n", + "The answer to (1) is to treat each training example as a single, separate person.
Strictly speaking, this is often overly zealous, as some training examples have no relevance to people and others may reference multiple or partial people (consider an image with several people in it). Thus, localizing exactly where \"people\" are referenced, and thus how much your model would change if people were removed, is challenging.\n", + "\n", + "The answer to (2) is also an open problem - but several interesting proposals have been made. We're going to focus on one of the most popular proposals, PATE.\n", + "\n", + "## An Example Scenario: A Health Neural Network\n", + "\n", + "First we're going to consider a scenario - you work for a hospital and you have a large collection of images about your patients. However, you don't know what's in them. You would like to use these images to develop a neural network which can automatically classify them; however, since your images aren't labeled, they aren't sufficient to train a classifier. \n", + "\n", + "However, being a cunning strategist, you realize that you can reach out to 10 partner hospitals which DO have annotated data. It is your hope to train your new classifier on their datasets so that you can automatically label your own. While these hospitals are interested in helping, they have privacy concerns regarding information about their patients. Thus, you will use the following technique to train a classifier which protects the privacy of patients in the other hospitals.\n", + "\n", + "- 1) You'll ask each of the 10 hospitals to train a model on their own dataset (all of which have the same kinds of labels)\n", + "- 2) You'll then use each of the 10 partner models to predict on your local dataset, generating 10 labels for each of your datapoints\n", + "- 3) Then, for each local data point (now with 10 labels), you will perform a DP query to generate the final true label. This query is a \"max\" function, where \"max\" is the most frequent label across the 10 labels.
We will need to add Laplacian noise to make this Differentially Private to a certain epsilon/delta constraint.\n", + "- 4) Finally, we will retrain a new model on our local dataset which now has labels. This will be our final \"DP\" model.\n", + "\n", + "So, let's walk through these steps. I will assume you're already familiar with how to train/predict a deep neural network, so we'll skip steps 1 and 2 and work with example data. We'll focus instead on step 3, namely how to perform the DP query for each example using toy data.\n", + "\n", + "So, let's say we have 10,000 training examples, and we've got 10 labels for each example (from our 10 \"teacher models\" which were trained directly on private data). Each label is chosen from a set of 10 possible labels (categories) for each image." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "30skLUk2NrUA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import numpy as np" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "q3tV7jL4NrUE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "num_teachers = 10 # we're working with 10 partner hospitals\n", + "num_examples = 10000 # the size of OUR dataset\n", + "num_labels = 10 # number of labels for our classifier" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "IaT_f1zKNrUJ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int).transpose(1,0) # fake predictions" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nt38zWhYNrUW", + "colab_type": "code", + "colab": {} + }, + "source": [ + "new_labels = list()\n", + "for an_image in preds:\n", + "\n", + " label_counts = np.bincount(an_image, minlength=num_labels)\n", + "\n", + " epsilon = 0.1\n", + " beta = 1 / epsilon\n", + "\n", + " for i in
range(len(label_counts)):\n", + " label_counts[i] += np.random.laplace(0, beta, 1)\n", + "\n", + " new_label = np.argmax(label_counts)\n", + " \n", + " new_labels.append(new_label)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "xc41XXHDNrUo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# new_labels" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "u52N4rs-NrU0", + "colab_type": "text" + }, + "source": [ + "# PATE Analysis" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "6-tQ3PAANrU9", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "f2ace11a-3dbf-449f-b208-87de1f6b620c" + }, + "source": [ + "labels = np.array([9, 9, 3, 6, 9, 9, 9, 9, 8, 2])\n", + "counts = np.bincount(labels, minlength=10)\n", + "query_result = np.argmax(counts)\n", + "query_result" + ], + "execution_count": 78, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "9" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 78 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "J6Uw_hHXAoeH", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "outputId": "5dd8a088-09d3-41c4-e667-fe65d217e174" + }, + "source": [ + "pip install syft" + ], + "execution_count": 81, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Collecting syft\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/5b/25/633ddb891b3c4927bd03311a04ece038387faecb46120b8429ed28c72c13/syft-0.1.23a1-py3-none-any.whl (251kB)\n", + "\u001b[K |████████████████████████████████| 256kB 5.0MB/s \n", + "\u001b[?25hCollecting websocket-client>=0.56.0 (from syft)\n", + "\u001b[?25l Downloading 
https://files.pythonhosted.org/packages/29/19/44753eab1fdb50770ac69605527e8859468f3c0fd7dc5a76dd9c4dbd7906/websocket_client-0.56.0-py2.py3-none-any.whl (200kB)\n", + "\u001b[K |████████████████████████████████| 204kB 41.6MB/s \n", + "\u001b[?25hCollecting flask-socketio>=3.3.2 (from syft)\n", + " Downloading https://files.pythonhosted.org/packages/66/44/edc4715af85671b943c18ac8345d0207972284a0cd630126ff5251faa08b/Flask_SocketIO-4.2.1-py2.py3-none-any.whl\n", + "Collecting websockets>=7.0 (from syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/f0/4b/ad228451b1c071c5c52616b7d4298ebcfcac5ae8515ede959db19e4cd56d/websockets-8.0.2-cp36-cp36m-manylinux1_x86_64.whl (72kB)\n", + "\u001b[K |████████████████████████████████| 81kB 23.5MB/s \n", + "\u001b[?25hCollecting tf-encrypted!=0.5.7,>=0.5.4 (from syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/1f/82/cf15aeac92525da2f794956712e7ebf418819390dec783430ee242b52d0b/tf_encrypted-0.5.8-py3-none-manylinux1_x86_64.whl (2.1MB)\n", + "\u001b[K |████████████████████████████████| 2.1MB 45.4MB/s \n", + "\u001b[?25hRequirement already satisfied: torchvision==0.3.0 in /usr/local/lib/python3.6/dist-packages (from syft) (0.3.0)\n", + "Requirement already satisfied: numpy>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from syft) (1.16.4)\n", + "Collecting zstd>=1.4.0.0 (from syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/22/37/6a7ba746ebddbd6cd06de84367515d6bc239acd94fb3e0b1c85788176ca2/zstd-1.4.1.0.tar.gz (454kB)\n", + "\u001b[K |████████████████████████████████| 460kB 37.6MB/s \n", + "\u001b[?25hRequirement already satisfied: torch==1.1 in /usr/local/lib/python3.6/dist-packages (from syft) (1.1.0)\n", + "Requirement already satisfied: Flask>=1.0.2 in /usr/local/lib/python3.6/dist-packages (from syft) (1.1.1)\n", + "Collecting lz4>=2.1.6 (from syft)\n", + "\u001b[?25l Downloading 
https://files.pythonhosted.org/packages/0a/c6/96bbb3525a63ebc53ea700cc7d37ab9045542d33b4d262d0f0408ad9bbf2/lz4-2.1.10-cp36-cp36m-manylinux1_x86_64.whl (385kB)\n", + "\u001b[K |████████████████████████████████| 389kB 40.3MB/s \n", + "\u001b[?25hRequirement already satisfied: scikit-learn>=0.21.0 in /usr/local/lib/python3.6/dist-packages (from syft) (0.21.3)\n", + "Collecting msgpack>=0.6.1 (from syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/92/7e/ae9e91c1bb8d846efafd1f353476e3fd7309778b582d2fb4cea4cc15b9a2/msgpack-0.6.1-cp36-cp36m-manylinux1_x86_64.whl (248kB)\n", + "\u001b[K |████████████████████████████████| 256kB 42.5MB/s \n", + "\u001b[?25hRequirement already satisfied: tblib>=1.4.0 in /usr/local/lib/python3.6/dist-packages (from syft) (1.4.0)\n", + "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from websocket-client>=0.56.0->syft) (1.12.0)\n", + "Collecting python-socketio>=4.3.0 (from flask-socketio>=3.3.2->syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/35/b0/22c3f785f23fec5c7a815f47c55d7e7946a67ae2129ff604148e939d3bdb/python_socketio-4.3.1-py2.py3-none-any.whl (49kB)\n", + "\u001b[K |████████████████████████████████| 51kB 17.3MB/s \n", + "\u001b[?25hCollecting pyyaml>=5.1 (from tf-encrypted!=0.5.7,>=0.5.4->syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)\n", + "\u001b[K |████████████████████████████████| 266kB 43.3MB/s \n", + "\u001b[?25hRequirement already satisfied: tensorflow<2,>=1.12.0 in /usr/local/lib/python3.6/dist-packages (from tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: pillow>=4.1.1 in /usr/local/lib/python3.6/dist-packages (from torchvision==0.3.0->syft) (4.3.0)\n", + "Requirement already satisfied: itsdangerous>=0.24 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (1.1.0)\n", + 
"Requirement already satisfied: Jinja2>=2.10.1 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (2.10.1)\n", + "Requirement already satisfied: Werkzeug>=0.15 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (0.15.5)\n", + "Requirement already satisfied: click>=5.1 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (7.0)\n", + "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.6/dist-packages (from scikit-learn>=0.21.0->syft) (0.13.2)\n", + "Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.6/dist-packages (from scikit-learn>=0.21.0->syft) (1.3.1)\n", + "Collecting python-engineio>=3.9.0 (from python-socketio>=4.3.0->flask-socketio>=3.3.2->syft)\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/2b/20/8e3ba16102ae2e245d70d9cb9fa48b076253fdb036dc43eea142294c2897/python_engineio-3.9.3-py2.py3-none-any.whl (119kB)\n", + "\u001b[K |████████████████████████████████| 122kB 41.9MB/s \n", + "\u001b[?25hRequirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.15.0)\n", + "Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.33.4)\n", + "Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.1.0)\n", + "Requirement already satisfied: tensorboard<1.15.0,>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages 
(from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.11.2)\n", + "Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.1.0)\n", + "Requirement already satisfied: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.2.2)\n", + "Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.8.0)\n", + "Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.1.7)\n", + "Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.7.1)\n", + "Requirement already satisfied: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.0.8)\n", + "Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (3.7.1)\n", + "Requirement already satisfied: olefile in /usr/local/lib/python3.6/dist-packages (from pillow>=4.1.1->torchvision==0.3.0->syft) (0.46)\n", + "Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.6/dist-packages (from Jinja2>=2.10.1->Flask>=1.0.2->syft) (1.1.1)\n", + "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (3.1.1)\n", + "Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (41.0.1)\n", + "Requirement already satisfied: h5py in 
/usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.6->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (2.8.0)\n", + "Building wheels for collected packages: zstd, pyyaml\n", + " Building wheel for zstd (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Created wheel for zstd: filename=zstd-1.4.1.0-cp36-cp36m-linux_x86_64.whl size=1067082 sha256=58735f741486e2d16b216afa2fcb3d91ce2257b25a14a3c45b37e82caeca0257\n", + " Stored in directory: /root/.cache/pip/wheels/66/3f/ee/ac08c81af7c1b24a80c746df669ea3cb37542d27877d66ccf4\n", + " Building wheel for pyyaml (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44105 sha256=97e1364e316e4a4cbfe3404f8aefbbe300407d2c47e27efa84ae12dd31a904ee\n", + " Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030\n", + "Successfully built zstd pyyaml\n", + "Installing collected packages: websocket-client, python-engineio, python-socketio, flask-socketio, websockets, pyyaml, tf-encrypted, zstd, lz4, msgpack, syft\n", + " Found existing installation: PyYAML 3.13\n", + " Uninstalling PyYAML-3.13:\n", + " Successfully uninstalled PyYAML-3.13\n", + " Found existing installation: msgpack 0.5.6\n", + " Uninstalling msgpack-0.5.6:\n", + " Successfully uninstalled msgpack-0.5.6\n", + "Successfully installed flask-socketio-4.2.1 lz4-2.1.10 msgpack-0.6.1 python-engineio-3.9.3 python-socketio-4.3.1 pyyaml-5.1.2 syft-0.1.23a1 tf-encrypted-0.5.8 websocket-client-0.56.0 websockets-8.0.2 zstd-1.4.1.0\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "G_Ui-1O3NrVA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from syft.frameworks.torch.differential_privacy import pate" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "enf7GqbBNrVF", + "colab_type": "code", + "colab": { + "base_uri": 
"https://localhost:8080/", + "height": 56 + }, + "outputId": "a37783e4-2f0b-4c17-813d-a003295d9d70" + }, + "source": [ + "num_teachers, num_examples, num_labels = (100, 100, 10)\n", + "preds = (np.random.rand(num_teachers, num_examples) * num_labels).astype(int) #fake preds\n", + "indices = (np.random.rand(num_examples) * num_labels).astype(int) # true answers\n", + "\n", + "preds[:,0:10] *= 0\n", + "\n", + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", + "\n", + "assert data_dep_eps < data_ind_eps\n", + "\n" + ], + "execution_count": 87, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Warning: May not have used enough values of l. Increase 'moments' variable and run again.\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "rtcGSVu_NrVM", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 94 + }, + "outputId": "f9caf2c0-cd62-465b-88a6-56ea4df30626" + }, + "source": [ + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5)\n", + "print(\"Data Independent Epsilon:\", data_ind_eps)\n", + "print(\"Data Dependent Epsilon:\", data_dep_eps)" + ], + "execution_count": 88, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Warning: May not have used enough values of l. 
Increase 'moments' variable and run again.\n", + "Data Independent Epsilon: 11.756462732485115\n", + "Data Dependent Epsilon: 1.52655213289881\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "54GPud2ENrVU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "preds[:,0:50] *= 0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "fvxVNNEMNrVY", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "ed19532a-5e94-420a-d998-81790178c03d" + }, + "source": [ + "data_dep_eps, data_ind_eps = pate.perform_analysis(teacher_preds=preds, indices=indices, noise_eps=0.1, delta=1e-5, moments=20)\n", + "print(\"Data Independent Epsilon:\", data_ind_eps)\n", + "print(\"Data Dependent Epsilon:\", data_dep_eps)" + ], + "execution_count": 90, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Data Independent Epsilon: 11.756462732485115\n", + "Data Dependent Epsilon: 0.9029013677789843\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "RY-gAWL9NrVd", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BmlFIVPaNrVk", + "colab_type": "text" + }, + "source": [ + "# Where to Go From Here\n", + "\n", + "\n", + "Read:\n", + " - The Algorithmic Foundations of Differential Privacy: https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf\n", + " - Deep Learning with Differential Privacy: https://arxiv.org/pdf/1607.00133.pdf\n", + " - The Ethical Algorithm: https://www.amazon.com/Ethical-Algorithm-Science-Socially-Design/dp/0190948205\n", + " \n", + "Topics:\n", + " - The Exponential Mechanism\n", + " - The Moments Accountant\n", + " - Differentially Private Stochastic Gradient Descent\n", + "\n", + "Advice:\n", + " - For deployments - stick with public frameworks!\n", + " - 
Join the Differential Privacy Community\n", + " - Don't get ahead of yourself - DP is still in the early days" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "wa9mH1R7NrVn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QhSsS_zsNrVz", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q8GdrZB1NrV2", + "colab_type": "text" + }, + "source": [ + "# Section Project:\n", + "\n", + "For the final project for this section, you're going to train a DP model using this PATE method on the MNIST dataset, provided below." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "BsblHz0vNrV3", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 283 + }, + "outputId": "24c17397-92b9-48c1-8e9f-674dec175ee5" + }, + "source": [ + "import torchvision.datasets as datasets\n", + "mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=None)" + ], + "execution_count": 91, + "outputs": [ + { + "output_type": "stream", + "text": [ + "\r0it [00:00, ?it/s]" + ], + "name": "stderr" + }, + { + "output_type": "stream", + "text": [ + "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz\n" + ], + "name": "stdout" + }, + { + "output_type": "stream", + "text": [ + "9920512it [00:01, 8729108.37it/s] \n" + ], + "name": "stderr" + }, + { + "output_type": "stream", + "text": [ + "Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz\n" + ], + "name": "stdout" + }, + { + "output_type": "stream", + "text": [ + " 0%| | 0/28881 [00:00 Date: Wed, 21 Aug 2019 17:43:31 +0530 Subject: [PATCH 05/14] Delete Section 2 - Federated Learning.ipynb --- Section 2 - Federated Learning.ipynb | 2325 -------------------------- 1 file changed, 
2325 deletions(-) delete mode 100644 Section 2 - Federated Learning.ipynb diff --git a/Section 2 - Federated Learning.ipynb b/Section 2 - Federated Learning.ipynb deleted file mode 100644 index aa80b43..0000000 --- a/Section 2 - Federated Learning.ipynb +++ /dev/null @@ -1,2325 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section: Federated Learning" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Introducing Federated Learning\n", - "\n", - "Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:\n", - "\n", - "Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload \"model updates\" to a central server.\n", - "\n", - "Use Cases:\n", - "\n", - " - app company (texting prediction app)\n", - " - predictive maintenance (automobiles / industrial engines)\n", - " - wearable medical devices\n", - " - ad blockers / autocomplete in browsers (Firefox/Brave)\n", - " \n", - "Challenge Description: data is distributed amongst sources but we cannot aggregate it because of:\n", - "\n", - " - privacy concerns: legal, user discomfort, competitive dynamics\n", - " - engineering: the bandwidth/storage requirements of aggregating the larger dataset" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Introducing / Installing PySyft\n", - "\n", - "In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This will require a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.\n", - "\n", - "### Install PySyft\n", - "\n", - "The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html).
Create a new environment, then install the dependencies in that environment. In your terminal:\n", - "\n", - "```bash\n", - "conda create -n pysyft python=3\n", - "conda activate pysyft # some older versions of conda require \"source activate pysyft\" instead.\n", - "conda install jupyter notebook\n", - "pip install syft\n", - "pip install numpy\n", - "```\n", - "\n", - "If you have any errors relating to zstd - run the following (if everything above installed fine then skip this step):\n", - "\n", - "```\n", - "pip install --upgrade --force-reinstall zstd\n", - "```\n", - "\n", - "and then retry installing syft (pip install syft).\n", - "\n", - "If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. \n", - "\n", - "With this environment activated and in the repo directory, launch Jupyter Notebook:\n", - "\n", - "```bash\n", - "jupyter notebook\n", - "```\n", - "\n", - "and re-open this notebook on the new Jupyter server.\n", - "\n", - "If any part of this doesn't work for you (or any of the tests fail) - first check the [README](https://github.com/OpenMined/PySyft.git) for installation help and then open a GitHub Issue or ping the #beginner channel in our slack!
[slack.openmined.org](http://slack.openmined.org/)" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import torch as th" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 2, 3, 4, 5])" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = th.tensor([1,2,3,4,5])\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "y = x + x" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tensor([ 2, 4, 6, 8, 10])\n" - ] - } - ], - "source": [ - "print(y)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import syft as sy" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "hook = sy.TorchHook(th)" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 2, 3, 4, 5])" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "th.tensor([1,2,3,4,5])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Basic Remote Execution in PySyft" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## PySyft => Remote PyTorch\n", - "\n", - "The essence of Federated Learning is the ability to train models in parallel on a wide number of machines. 
Thus, we need the ability to tell remote machines to execute the operations required for Deep Learning.\n", - "\n", - "So instead of using Torch tensors directly, we're now going to work with **pointers** to tensors. Let me show you what I mean. First, let's create a \"pretend\" machine owned by a \"pretend\" person - we'll call him Bob." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "bob = sy.VirtualWorker(hook, id=\"bob\")" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5])" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{50844634909: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.location" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "50844634909" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.id_at_location" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "85673375777" - ] - 
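The bookkeeping behind `.send()` and these pointer attributes is simple enough to mimic in a few lines of plain Python. Below is a toy sketch (hypothetical names of my own, not the actual PySyft implementation): a worker is just an object store keyed by id, and a pointer records its own `id`, the worker it points to (`location`), and the id of the value on that worker (`id_at_location`).

```python
import random

class ToyWorker:
    """A pretend remote machine: nothing but an object store keyed by id."""
    def __init__(self, id):
        self.id = id
        self._objects = {}

class ToyPointer:
    """Knows where a value lives and under which id, but holds no data itself."""
    def __init__(self, location, id_at_location):
        self.location = location              # the worker holding the value
        self.id_at_location = id_at_location  # the value's id on that worker
        self.id = random.randrange(10 ** 11)  # the pointer's own id

    def get(self):
        # retrieve the value and remove it from the remote store
        return self.location._objects.pop(self.id_at_location)

def send(value, worker):
    """Store value on worker and return a pointer to it."""
    obj_id = random.randrange(10 ** 11)
    worker._objects[obj_id] = value
    return ToyPointer(worker, obj_id)

bob = ToyWorker("bob")
x = send([1, 2, 3, 4, 5], bob)
print(bob._objects)   # one entry, keyed by a random id
x = x.get()
print(bob._objects)   # {} - getting the value cleared bob's store
```

In this sketch, `.get()` both returns the value and deletes it from the worker's store, which is why `bob._objects` is empty again afterwards, mirroring what the real cells above demonstrate.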
}, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.id" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.owner" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "hook.local_worker" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:85673375777 -> bob:50844634909]" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 2, 3, 4, 5])" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = x.get()\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Playing with Remote Tensors\n", - "\n", - "In this project, I want you to .send() and .get() a tensor to TWO workers by calling .send(bob,alice). This will first require the creation of another VirtualWorker called alice." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" 
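As a hint for the project above: `.send(bob, alice)` has to store one copy of the tensor on each worker and hand back something that points at both. That bookkeeping can be sketched in plain Python (hypothetical helper names, not the PySyft API):

```python
import random

# Two pretend workers, each reduced to an object store keyed by id.
bob_store, alice_store = {}, {}

def send_to_all(value, *stores):
    """Store a copy of value on every worker; return one id per worker."""
    ids = []
    for store in stores:
        obj_id = random.randrange(10 ** 11)
        store[obj_id] = list(value)   # each worker holds its own copy
        ids.append(obj_id)
    return ids

def get_from_all(ids, *stores):
    """Retrieve (and delete) the copy held by every worker."""
    return [store.pop(obj_id) for store, obj_id in zip(stores, ids)]

ids = send_to_all([1, 2, 3, 4, 5], bob_store, alice_store)
copies = get_from_all(ids, bob_store, alice_store)
print(copies)   # [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
```

The real project should of course use `sy.VirtualWorker` and `.send(bob, alice)` as described; the sketch only shows the shape of the bookkeeping involved.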
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Introducing Remote Arithmetic" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)\n", - "y = th.tensor([1,1,1,1,1]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:7344279461 -> bob:74976771769]" - ] - }, - "execution_count": 28, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:68830170639 -> bob:70981483926]" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y" - ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [], - "source": [ - "z = x + y" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": 
{}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:79875157913 -> bob:79875157913]" - ] - }, - "execution_count": 31, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([2, 3, 4, 5, 6])" - ] - }, - "execution_count": 32, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = z.get()\n", - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:28437210120 -> bob:28437210120]" - ] - }, - "execution_count": 33, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = th.add(x,y)\n", - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([2, 3, 4, 5, 6])" - ] - }, - "execution_count": 34, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = z.get()\n", - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)\n", - "y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [], - "source": [ - "z = (x + y).sum()" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [], - "source": [ - "z.backward()" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1., 2., 3., 4., 5.], requires_grad=True)" - ] - }, - "execution_count": 39, - "metadata": 
{}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1., 1., 1., 1., 1.])" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.grad" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Learn a Simple Linear Model\n", - "\n", - "In this project, I'd like you to create a simple linear model that fits the dataset below. You should use only Variables and .backward() to do so (no optimizers or nn.Modules). Furthermore, you must do so with both the data and the model located on Bob's machine." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" 
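For reference while attempting the project, here is what the bare training logic looks like on a made-up dataset, written in plain Python with hand-derived gradients (in the project itself you would instead build tensors with `requires_grad=True`, call `.backward()` as shown above, and keep both data and model on Bob's machine via `.send(bob)`):

```python
# Made-up dataset following y = 2*x + 1; the notebook's actual dataset
# is left for the reader, so this is only a stand-in.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0   # linear model: pred = w*x + b
lr = 0.05         # learning rate

for _ in range(500):
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y
        grad_w += 2 * err * x / len(data)   # d(mse)/dw
        grad_b += 2 * err / len(data)       # d(mse)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))   # converges toward w = 2, b = 1
```

The remote version runs exactly these update steps, just through pointers instead of local values.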
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Garbage Collection and Common Errors\n" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": {}, - "outputs": [], - "source": [ - "bob = bob.clear_objects()" - ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 45, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 47, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{23831414651: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 47, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 48, - "metadata": {}, - "outputs": [], - "source": [ - "del x" - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 49, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - 
"bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 50, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 51, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{92212512027: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 51, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 52, - "metadata": {}, - "outputs": [], - "source": [ - "x = \"asdf\"" - ] - }, - { - "cell_type": "code", - "execution_count": 53, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 53, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 54, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 55, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:19384969793 -> bob:49166137090]" - ] - }, - "execution_count": 55, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 56, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{49166137090: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 56, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 57, - "metadata": {}, - "outputs": [], - "source": [ - "x = \"asdf\"" - ] - }, - { - "cell_type": "code", - "execution_count": 58, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{49166137090: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 58, - "metadata": {}, - "output_type": "execute_result" - } - 
], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 59, - "metadata": {}, - "outputs": [], - "source": [ - "del x" - ] - }, - { - "cell_type": "code", - "execution_count": 60, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{49166137090: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 60, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 61, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 61, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob = bob.clear_objects()\n", - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 62, - "metadata": {}, - "outputs": [], - "source": [ - "for i in range(1000):\n", - " x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 63, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{17426510898: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 63, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 64, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)\n", - "y = th.tensor([1,1,1,1,1])" - ] - }, - { - "cell_type": "code", - "execution_count": 65, - "metadata": {}, - "outputs": [ - { - "ename": "TensorsNotCollocatedException", - "evalue": "You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). 
Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mPureTorchTensorFoundError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 561\u001b[0m new_self, new_args = syft.frameworks.torch.hook_args.hook_method_args(\n\u001b[0;32m--> 562\u001b[0;31m \u001b[0mmethod_name\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 563\u001b[0m )\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mhook_method_args\u001b[0;34m(attr, method_self, args)\u001b[0m\n\u001b[1;32m 85\u001b[0m \u001b[0;31m# Try running it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 86\u001b[0;31m \u001b[0mnew_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_args\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook_args\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmethod_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 87\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m 
\u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtwo_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 420\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtwo_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 421\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 422\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - 
"\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtuple_one_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 414\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtuple_one_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 415\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 416\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 248\u001b[0m \u001b[0;31m# Last if not, rule is probably == 1 so use type to return the right transformation.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 249\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mforward_func\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 250\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrules\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# And do this for all the args / rules 
provided\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(.0)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m 
\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mPureTorchTensorFoundError\u001b[0m: tensor([1, 1, 1, 1, 1])", - "\nDuring handling of the above exception, another exception occurred:\n", - "\u001b[0;31mTensorsNotCollocatedException\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 564\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 565\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 566\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 567\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 568\u001b[0m \u001b[0;31m# Send the new command to the appropriate class and get the response\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mTensorsNotCollocatedException\u001b[0m: You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])" - ] - } - ], - "source": [ - "z = x + y" - ] - }, - { - "cell_type": "code", - "execution_count": 130, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)\n", - "y = th.tensor([1,1,1,1,1]).send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 66, - "metadata": {}, - "outputs": [ - { - "ename": "TensorsNotCollocatedException", - "evalue": "You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). 
Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mPureTorchTensorFoundError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 561\u001b[0m new_self, new_args = syft.frameworks.torch.hook_args.hook_method_args(\n\u001b[0;32m--> 562\u001b[0;31m \u001b[0mmethod_name\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 563\u001b[0m )\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mhook_method_args\u001b[0;34m(attr, method_self, args)\u001b[0m\n\u001b[1;32m 85\u001b[0m \u001b[0;31m# Try running it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 86\u001b[0;31m \u001b[0mnew_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_args\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook_args\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmethod_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 87\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m 
\u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtwo_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 420\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtwo_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 421\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 422\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - 
"\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtuple_one_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 414\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtuple_one_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 415\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 416\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 248\u001b[0m \u001b[0;31m# Last if not, rule is probably == 1 so use type to return the right transformation.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 249\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mforward_func\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 250\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrules\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# And do this for all the args / rules 
provided\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(.0)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m 
\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mPureTorchTensorFoundError\u001b[0m: tensor([1, 1, 1, 1, 1])", - "\nDuring handling of the above exception, another exception occurred:\n", - "\u001b[0;31mTensorsNotCollocatedException\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 564\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 565\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 566\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 567\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 568\u001b[0m \u001b[0;31m# Send the new command to the appropriate class and get the response\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mTensorsNotCollocatedException\u001b[0m: You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])" - ] - } - ], - "source": [ - "z = x + y" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Toy Federated Learning\n", - "\n", - "Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:\n", - "\n", - "- a toy dataset\n", - "- a model\n", - "- some basic training logic for training a model to fit the data."
- ] - }, - { - "cell_type": "code", - "execution_count": 69, - "metadata": {}, - "outputs": [], - "source": [ - "from torch import nn, optim" - ] - }, - { - "cell_type": "code", - "execution_count": 67, - "metadata": {}, - "outputs": [], - "source": [ - "# A Toy Dataset\n", - "data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)\n", - "target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)" - ] - }, - { - "cell_type": "code", - "execution_count": 70, - "metadata": {}, - "outputs": [], - "source": [ - "# A Toy Model\n", - "model = nn.Linear(2,1)" - ] - }, - { - "cell_type": "code", - "execution_count": 71, - "metadata": {}, - "outputs": [], - "source": [ - "opt = optim.SGD(params=model.parameters(), lr=0.1)" - ] - }, - { - "cell_type": "code", - "execution_count": 85, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tensor(0.0860)\n", - "tensor(0.0586)\n", - "tensor(0.0402)\n", - "tensor(0.0278)\n", - "tensor(0.0194)\n", - "tensor(0.0136)\n", - "tensor(0.0096)\n", - "tensor(0.0069)\n", - "tensor(0.0049)\n", - "tensor(0.0036)\n", - "tensor(0.0026)\n", - "tensor(0.0019)\n", - "tensor(0.0014)\n", - "tensor(0.0010)\n", - "tensor(0.0008)\n", - "tensor(0.0006)\n", - "tensor(0.0004)\n", - "tensor(0.0003)\n", - "tensor(0.0002)\n", - "tensor(0.0002)\n" - ] - } - ], - "source": [ - "def train(iterations=20):\n", - " for iter in range(iterations):\n", - " opt.zero_grad()\n", - "\n", - " pred = model(data)\n", - "\n", - " loss = ((pred - target)**2).sum()\n", - "\n", - " loss.backward()\n", - "\n", - " opt.step()\n", - "\n", - " print(loss.data)\n", - " \n", - "train()" - ] - }, - { - "cell_type": "code", - "execution_count": 89, - "metadata": {}, - "outputs": [], - "source": [ - "data_bob = data[0:2].send(bob)\n", - "target_bob = target[0:2].send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 90, - "metadata": {}, - "outputs": [], - "source": [ - "data_alice = data[2:4].send(alice)\n", - 
"target_alice = target[2:4].send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 91, - "metadata": {}, - "outputs": [], - "source": [ - "datasets = [(data_bob, target_bob), (data_alice, target_alice)]" - ] - }, - { - "cell_type": "code", - "execution_count": 122, - "metadata": {}, - "outputs": [], - "source": [ - "def train(iterations=20):\n", - "\n", - " model = nn.Linear(2,1)\n", - " opt = optim.SGD(params=model.parameters(), lr=0.1)\n", - " \n", - " for iter in range(iterations):\n", - "\n", - " for _data, _target in datasets:\n", - "\n", - " # send model to the data\n", - " model = model.send(_data.location)\n", - "\n", - " # do normal training\n", - " opt.zero_grad()\n", - " pred = model(_data)\n", - " loss = ((pred - _target)**2).sum()\n", - " loss.backward()\n", - " opt.step()\n", - "\n", - " # get smarter model back\n", - " model = model.get()\n", - "\n", - " print(loss.get())" - ] - }, - { - "cell_type": "code", - "execution_count": 123, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tensor(0.0824, requires_grad=True)\n", - "tensor(0.0006, requires_grad=True)\n", - "tensor(5.0509e-06, requires_grad=True)\n", - "tensor(1.8660e-07, requires_grad=True)\n", - "tensor(1.2442e-07, requires_grad=True)\n", - "tensor(1.0333e-07, requires_grad=True)\n", - "tensor(8.5991e-08, requires_grad=True)\n", - "tensor(7.1573e-08, requires_grad=True)\n", - "tensor(5.9562e-08, requires_grad=True)\n", - "tensor(4.9569e-08, requires_grad=True)\n", - "tensor(4.1269e-08, requires_grad=True)\n", - "tensor(3.4340e-08, requires_grad=True)\n", - "tensor(2.8568e-08, requires_grad=True)\n", - "tensor(2.3789e-08, requires_grad=True)\n", - "tensor(1.9802e-08, requires_grad=True)\n", - "tensor(1.6471e-08, requires_grad=True)\n", - "tensor(1.3715e-08, requires_grad=True)\n", - "tensor(1.1405e-08, requires_grad=True)\n", - "tensor(9.4892e-09, requires_grad=True)\n", - "tensor(7.9073e-09, requires_grad=True)\n", - 
"tensor(6.5776e-09, requires_grad=True)\n", - "tensor(5.4857e-09, requires_grad=True)\n", - "tensor(4.5577e-09, requires_grad=True)\n", - "tensor(3.8027e-09, requires_grad=True)\n", - "tensor(3.1617e-09, requires_grad=True)\n", - "tensor(2.6215e-09, requires_grad=True)\n", - "tensor(2.1910e-09, requires_grad=True)\n", - "tensor(1.8242e-09, requires_grad=True)\n", - "tensor(1.5201e-09, requires_grad=True)\n", - "tensor(1.2593e-09, requires_grad=True)\n", - "tensor(1.0471e-09, requires_grad=True)\n", - "tensor(8.7642e-10, requires_grad=True)\n", - "tensor(7.2482e-10, requires_grad=True)\n", - "tensor(6.0216e-10, requires_grad=True)\n", - "tensor(5.0250e-10, requires_grad=True)\n", - "tensor(4.2101e-10, requires_grad=True)\n", - "tensor(3.5084e-10, requires_grad=True)\n", - "tensor(2.8831e-10, requires_grad=True)\n", - "tensor(2.4171e-10, requires_grad=True)\n", - "tensor(2.0130e-10, requires_grad=True)\n" - ] - } - ], - "source": [ - "train()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Advanced Remote Execution Tools\n", - "\n", - "In the last section we trained a toy model using Federated Learning. We did this by calling .send() and .get() on our model, sending it to the location of training data, updating it, and then bringing it back. However, at the end of the example we realized that we needed to go a bit further to protect people's privacy. Namely, we want to average the gradients BEFORE calling .get(). 
That way, we won't ever see anyone's exact gradient (thus better protecting their privacy!!!)\n", - "\n", - "But, in order to do this, we need a few more pieces:\n", - "\n", - "- use a pointer to send a Tensor directly to another worker\n", - "\n", - "And in addition, while we're here, we're going to learn about a few more advanced tensor operations as well which will help us both with this example and a few in the future!" - ] - }, - { - "cell_type": "code", - "execution_count": 163, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 163, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob.clear_objects()\n", - "alice.clear_objects()" - ] - }, - { - "cell_type": "code", - "execution_count": 164, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 164, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 167, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 170, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 171, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{21931995238: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 171, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 172, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{55299383133: (Wrapper)>[PointerTensor | alice:55299383133 -> bob:21931995238]}" - ] - }, - "execution_count": 172, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 174, - "metadata": {}, - "outputs": [], - 
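The "aggregate before .get()" idea can be sketched in plain Python. This is a hypothetical simulation, not the PySyft API: each worker computes its gradient locally, and only the average ever reaches the model owner.

```python
# Hypothetical plain-Python simulation (not the PySyft API): each worker's
# exact gradient stays local, and only the average is revealed.
def local_gradient(worker_data, w):
    # gradient of sum((w*x - y)^2) over this worker's own data
    return sum(2 * (w * x - y) * x for x, y in worker_data)

bob_data = [(1.0, 2.0), (2.0, 4.0)]    # bob's private data (y = 2x)
alice_data = [(3.0, 6.0), (4.0, 8.0)]  # alice's private data (y = 2x)
w = 0.0

# gradients are computed *on* the workers; in PySyft these would be pointers
grads = [local_gradient(bob_data, w), local_gradient(alice_data, w)]

# aggregate first, then fetch only the average -- no single worker's
# exact gradient is ever seen by the model owner
avg_grad = sum(grads) / len(grads)
w -= 0.01 * avg_grad
print(w)
```

In the real setup, the summation itself would also happen remotely (via pointer-to-pointer sends), so even the intermediate sum never touches the model owner's machine.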
"source": [ - "y = x + x" - ] - }, - { - "cell_type": "code", - "execution_count": 175, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:48924169518 -> alice:48924169518]" - ] - }, - "execution_count": 175, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y" - ] - }, - { - "cell_type": "code", - "execution_count": 176, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{21931995238: tensor([1, 2, 3, 4, 5]),\n", - " 48924169518: tensor([ 2, 4, 6, 8, 10])}" - ] - }, - "execution_count": 176, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 177, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{48924169518: (Wrapper)>[PointerTensor | alice:48924169518 -> bob:48924169518],\n", - " 55299383133: (Wrapper)>[PointerTensor | alice:55299383133 -> bob:21931995238]}" - ] - }, - "execution_count": 177, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 178, - "metadata": {}, - "outputs": [], - "source": [ - "jon = sy.VirtualWorker(hook, id=\"jon\")" - ] - }, - { - "cell_type": "code", - "execution_count": 186, - "metadata": {}, - "outputs": [], - "source": [ - "bob.clear_objects()\n", - "alice.clear_objects()\n", - "\n", - "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 187, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{10539507281: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 187, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 188, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{40541026396: (Wrapper)>[PointerTensor | alice:40541026396 -> 
bob:10539507281]}" - ] - }, - "execution_count": 188, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 189, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:40541026396 -> bob:10539507281]" - ] - }, - "execution_count": 189, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = x.get()\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 190, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{10539507281: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 190, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 191, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 191, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 192, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 2, 3, 4, 5])" - ] - }, - "execution_count": 192, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = x.get()\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 193, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 193, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 194, - "metadata": {}, - "outputs": [], - "source": [ - "bob.clear_objects()\n", - "alice.clear_objects()\n", - "\n", - "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 195, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{28843833290: tensor([1, 2, 3, 4, 5])}" - 
] - }, - "execution_count": 195, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 196, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{88814770977: (Wrapper)>[PointerTensor | alice:88814770977 -> bob:28843833290]}" - ] - }, - "execution_count": 196, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 197, - "metadata": {}, - "outputs": [], - "source": [ - "del x" - ] - }, - { - "cell_type": "code", - "execution_count": 198, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 198, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 199, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 199, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Pointer Chain Operations" - ] - }, - { - "cell_type": "code", - "execution_count": 212, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 212, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob.clear_objects()\n", - "alice.clear_objects()" - ] - }, - { - "cell_type": "code", - "execution_count": 213, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 214, - "metadata": {}, - "outputs": [ - { - 
"data": { - "text/plain": [ - "{81966670653: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 214, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 215, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 215, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 216, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:547575813 -> alice:547575813]" - ] - }, - "execution_count": 216, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.move(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 217, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 217, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 218, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{547575813: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 218, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 202, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" - ] - }, - { - "cell_type": "code", - "execution_count": 203, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{75487377866: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 203, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 204, - "metadata": {}, - 
"outputs": [ - { - "data": { - "text/plain": [ - "{94092707138: (Wrapper)>[PointerTensor | alice:94092707138 -> bob:75487377866]}" - ] - }, - "execution_count": 204, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 205, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:87205391815 -> alice:94092707138]" - ] - }, - "execution_count": 205, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.remote_get()" - ] - }, - { - "cell_type": "code", - "execution_count": 206, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{}" - ] - }, - "execution_count": 206, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 207, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{94092707138: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 207, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 208, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:87205391815 -> bob:87205391815]" - ] - }, - "execution_count": 208, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.move(bob)" - ] - }, - { - "cell_type": "code", - "execution_count": 209, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[PointerTensor | me:87205391815 -> bob:87205391815]" - ] - }, - "execution_count": 209, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 210, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{87205391815: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 210, - 
"metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "code", - "execution_count": 211, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{94092707138: tensor([1, 2, 3, 4, 5])}" - ] - }, - "execution_count": 211, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "alice._objects" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", 
- "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.1" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From ddc38927e284247077f08f7b7edddd95844abb20 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Wed, 21 Aug 2019 18:21:08 +0530 Subject: [PATCH 06/14] Add files via upload --- Section_2_Federated_Learning.ipynb | 3122 ++++++++++++++++++++++++++++ 1 file changed, 3122 insertions(+) create mode 100644 Section_2_Federated_Learning.ipynb diff --git a/Section_2_Federated_Learning.ipynb b/Section_2_Federated_Learning.ipynb new file mode 100644 index 0000000..71aa515 --- /dev/null +++ b/Section_2_Federated_Learning.ipynb @@ -0,0 +1,3122 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.1" + }, + "colab": { + "name": "Section 2 - Federated Learning.ipynb", + "version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "AcXECkyaE2T9", + "colab_type": "text" + }, + "source": [ + "# Section: Federated Learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Agkm5UYcE2T_", + "colab_type": "text" + }, + "source": [ + "# Lesson: Introducing Federated Learning\n", + "\n", + "Federated Learning is a technique for training Deep Learning models on data to which you do not have access. 
Basically:\n", + "\n", + "Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload \"model updates\" to a central server.\n", + "\n", + "Use Cases:\n", + "\n", + " - app company (Texting prediction app)\n", + " - predictive maintenance (automobiles / industrial engines)\n", + " - wearable medical devices\n", + " - ad blockers / autocomplete in browsers (Firefox/Brave)\n", + " \n", + "Challenge Description: data is distributed amongst sources but we cannot aggregate it because of:\n", + "\n", + " - privacy concerns: legal, user discomfort, competitive dynamics\n", + " - engineering: the bandwidth/storage requirements of aggregating the larger dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zrya73T-E2UA", + "colab_type": "text" + }, + "source": [ + "# Lesson: Introducing / Installing PySyft\n", + "\n", + "In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This will require a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.\n", + "\n", + "### Install PySyft\n", + "\n", + "The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. 
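One federated round of the "bring the model to the data" idea above can be sketched in plain Python. This is an illustrative sketch with hypothetical helper names, not a real federated-learning API: clients train on-device and upload only model updates, which the server averages.

```python
# Sketch of one federated round (hypothetical helpers, plain Python):
# clients train locally; the server only ever sees averaged model updates.
def local_update(w, client_data, lr=0.1):
    # one gradient step of sum((w*x - y)^2) on the client's own data
    grad = sum(2 * (w * x - y) * x for x, y in client_data)
    return w - lr * grad

# two clients whose private data share the same underlying slope (y = 3x)
clients = {
    "phone_a": [(1.0, 3.0)],
    "phone_b": [(2.0, 6.0)],
}

global_w = 0.0
for _ in range(10):
    # each client trains on-device; raw data never leaves the client
    updates = [local_update(global_w, data) for data in clients.values()]
    # the server aggregates updates into a new global model
    global_w = sum(updates) / len(updates)
print(global_w)  # approaches 3.0, the slope shared by both clients
```

Real systems (e.g. text-prediction on phones) add secure aggregation on top, so the server cannot inspect any individual client's update either.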
In your terminal:\n", + "\n", + "```bash\n", + "conda create -n pysyft python=3\n", + "conda activate pysyft # some older versions of conda require \"source activate pysyft\" instead.\n", + "conda install jupyter notebook\n", + "pip install syft\n", + "pip install numpy\n", + "```\n", + "\n", + "If you have any errors relating to zstd - run the following (if everything above installed fine then skip this step):\n", + "\n", + "```\n", + "pip install --upgrade --force-reinstall zstd\n", + "```\n", + "\n", + "and then retry installing syft (pip install syft).\n", + "\n", + "If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. \n", + "\n", + "With this environment activated and in the repo directory, launch Jupyter Notebook:\n", + "\n", + "```bash\n", + "jupyter notebook\n", + "```\n", + "\n", + "and re-open this notebook on the new Jupyter server.\n", + "\n", + "If any part of this doesn't work for you (or any of the tests fail) - first check the [README](https://github.com/OpenMined/PySyft.git) for installation help and then open a GitHub Issue or ping the #beginner channel in our slack! 
[slack.openmined.org](http://slack.openmined.org/)" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "P6d61U1GE2UC", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import torch as th" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "gdYSOinGE2UH", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "a345b87a-a158-478d-b1ba-aecc767938a3" + }, + "source": [ + "x = th.tensor([1,2,3,4,5])\n", + "x" + ], + "execution_count": 2, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 2, 3, 4, 5])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 2 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "aPlABSqbE2UO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "BNBOkvCyE2US", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "5c8e5a28-b7c8-4524-b415-e1c3a12c23af" + }, + "source": [ + "print(y)" + ], + "execution_count": 4, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor([ 2, 4, 6, 8, 10])\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "hDoz6wlxE2Uf", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import syft as sy" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "bxLh65mzE2Uj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "hook = sy.TorchHook(th)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "puaKByDqE2Uo", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "9fddea4e-6368-4cfe-bf7e-0ee22f802dd2" + }, + "source": [ + 
"th.tensor([1,2,3,4,5])" + ], + "execution_count": 9, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 2, 3, 4, 5])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 9 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "245kHTL1E2Uw", + "colab_type": "text" + }, + "source": [ + "# Lesson: Basic Remote Execution in PySyft" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hOt46k6-E2Ux", + "colab_type": "text" + }, + "source": [ + "## PySyft => Remote PyTorch\n", + "\n", + "The essence of Federated Learning is the ability to train models in parallel across a large number of machines. Thus, we need the ability to tell remote machines to execute the operations required for Deep Learning.\n", + "\n", + "So, instead of using Torch tensors - we're now going to work with **pointers** to tensors. Let me show you what I mean. First, let's create a \"pretend\" machine owned by a \"pretend\" person - we'll call him Bob."
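The pointer idea itself can be modeled in a few lines of plain Python. This is a toy stand-in with hypothetical classes, not PySyft's real PointerTensor/VirtualWorker machinery: the worker stores the data, and we keep only a record of where it lives.

```python
# Toy stand-in for a pointer to remote data (hypothetical classes, not PySyft):
# the worker holds the data, and we hold only its address.
class Worker:
    def __init__(self, id):
        self.id = id
        self._objects = {}  # id_at_location -> data, like bob._objects

class Pointer:
    def __init__(self, location, id_at_location):
        self.location = location              # which worker has the data
        self.id_at_location = id_at_location  # its id in that worker's store

def send(data, worker, obj_id=1):
    worker._objects[obj_id] = data  # the data now lives on the worker
    return Pointer(worker, obj_id)  # we keep only a pointer to it

def get(pointer):
    # retrieve the data and remove it from the remote worker's store
    return pointer.location._objects.pop(pointer.id_at_location)

bob = Worker("bob")
x = send([1, 2, 3, 4, 5], bob)
print(bob._objects)     # {1: [1, 2, 3, 4, 5]}
x = get(x)
print(x, bob._objects)  # [1, 2, 3, 4, 5] {}
```

This mirrors the bookkeeping you will see below: after .send(), bob's object store holds the tensor; after .get(), the store is empty again.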
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "GAcDewj9E2Uy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "eF8A4RXKE2U9", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "6a29a70e-b709-40b8-f3ef-7c4a30a2765f" + }, + "source": [ + "bob._objects" + ], + "execution_count": 11, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 11 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UW9QqbUtE2VF", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "BoY9kVWXE2VP", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "IL6JkVT4E2VV", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "d1c57403-4932-4781-ab24-c9876def197c" + }, + "source": [ + "bob._objects" + ], + "execution_count": 14, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{99731784642: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 14 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "9dhalhOvE2Vd", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "4a87d064-7e32-4d3f-8b95-152dc3bf9ea7" + }, + "source": [ + "x.location" + ], + "execution_count": 15, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + 
"execution_count": 15 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "cnBclugNE2Vh", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "9aeea873-44af-4124-8069-3a171e7cb50d" + }, + "source": [ + "x.id_at_location" + ], + "execution_count": 16, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "99731784642" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 16 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VI5Dt1ZEE2Vk", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "345f154a-130b-495a-b6e3-449151c9fffb" + }, + "source": [ + "x.id" + ], + "execution_count": 17, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "80204050205" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 17 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "eQuDOeqBE2Vp", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "941e49be-4409-41d3-fc3f-69b3af299ded" + }, + "source": [ + "x.owner" + ], + "execution_count": 18, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 18 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "mNgt1qxTE2V3", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "4f729f7b-55f7-4361-e29f-14919509abac" + }, + "source": [ + "hook.local_worker" + ], + "execution_count": 19, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 19 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Vi-wPIUkE2WA", + "colab_type": "code", + "colab": { + "base_uri": 
"https://localhost:8080/", + "height": 36 + }, + "outputId": "ace409f0-b489-46b0-b423-a3f6cb06fb4d" + }, + "source": [ + "x" + ], + "execution_count": 20, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:80204050205 -> bob:99731784642]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 20 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "m9ivrLqlE2WR", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "0349c771-97db-47d3-e9c0-ddca7ee4749f" + }, + "source": [ + "x = x.get()\n", + "x" + ], + "execution_count": 21, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 2, 3, 4, 5])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 21 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "g8C1OBqLE2Wd", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "d4d14b37-10e1-47f7-c19d-1a4a612b79cf" + }, + "source": [ + "bob._objects" + ], + "execution_count": 22, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 22 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WjcaBHZUE2Wh", + "colab_type": "text" + }, + "source": [ + "# Project: Playing with Remote Tensors\n", + "\n", + "In this project, I want you to .send() and .get() a tensor to TWO workers by calling .send(bob,alice). This will first require the creation of another VirtualWorker called alice." 
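What `.send(bob, alice)` and `.get(sum_results=True)` do conceptually can also be sketched without PySyft. This is a hypothetical stand-in (`Worker`, `MultiPointer`, and `send` are invented names for the sketch): each worker receives its own copy of the data, and gathering either returns the list of copies or their elementwise sum.

```python
class Worker:
    def __init__(self, id):
        self.id = id
        self._objects = {}   # object id -> stored value

class MultiPointer:
    """Points at one copy of the data on each of several workers."""
    def __init__(self, entries):
        self.entries = entries   # list of (worker, object id) pairs

    def get(self, sum_results=False):
        # fetch (and free) every copy
        values = [w._objects.pop(obj_id) for w, obj_id in self.entries]
        if sum_results:
            # elementwise sum across the copies
            return [sum(col) for col in zip(*values)]
        return values

def send(value, *workers):
    entries = []
    for w in workers:
        obj_id = len(w._objects)   # toy id scheme, good enough for a sketch
        w._objects[obj_id] = list(value)
        entries.append((w, obj_id))
    return MultiPointer(entries)

bob, alice = Worker("bob"), Worker("alice")

ptr = send([1, 2, 3, 4, 5], bob, alice)
copies = ptr.get()                    # one copy per worker

ptr = send([1, 2, 3, 4, 5], bob, alice)
summed = ptr.get(sum_results=True)    # elementwise sum of the copies
```

This mirrors the two behaviours the notebook demonstrates: a plain `.get()` returns a list of tensors, one per worker, while `sum_results=True` collapses them into a single tensor.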
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "K38H0YQWE2Wi", + "colab_type": "code", + "colab": {} + }, + "source": [ + "alice = sy.VirtualWorker(hook, id=\"alice\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "D2wD_WDwE2Wn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x= th.tensor([1,2,3,4,5])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "aelOZFwIE2Wx", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x_ptr = x.send(bob, alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "K0tkhUt1E2W1", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "dfeea965-35fd-44d1-b052-491a87b54ce4" + }, + "source": [ + "x_ptr.get()" + ], + "execution_count": 26, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[tensor([1, 2, 3, 4, 5]), tensor([1, 2, 3, 4, 5])]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 26 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "g7fVz2n8E2W4", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob, alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Khrv2ZbnE2W7", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "a9968ae4-be78-47bc-d303-ea116d05dc95" + }, + "source": [ + "x.get(sum_results=True)" + ], + "execution_count": 28, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([ 2, 4, 6, 8, 10])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 28 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OgYKx9iXE2XP", + "colab_type": "text" + }, + "source": [ + "# Lesson: Introducing Remote 
Arithmetic" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "dCbwJvV1E2XS", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)\n", + "y = th.tensor([1,1,1,1,1]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "cXA9CyfgE2XW", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "8f01ccde-8bec-4206-ae2e-20a54cc18f73" + }, + "source": [ + "x" + ], + "execution_count": 44, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:84248250794 -> bob:64032220097]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 44 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "cbikEs3sE2Xc", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "a18ad66b-99ca-499d-fce9-ef5e996cb0d2" + }, + "source": [ + "y" + ], + "execution_count": 45, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:66658247205 -> bob:62920035025]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 45 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8LoAnVp3E2Xh", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x + y" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Y5_rZ4cdE2Xk", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "34c9aa52-90b2-4ab1-9741-33b0d1056c05" + }, + "source": [ + "z" + ], + "execution_count": 47, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:7549089470 -> bob:99286676007]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 47 + } + ] + }, + { + "cell_type": 
"code", + "metadata": { + "id": "fn96z_h5E2Xn", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "97c28b05-6cf3-40ec-890f-961fa4df9f36" + }, + "source": [ + "z = z.get()\n", + "z" + ], + "execution_count": 48, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([2, 3, 4, 5, 6])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 48 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "fctWInVaE2Xu", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "5248a973-c70d-41f0-bc04-fbe73b5e07f7" + }, + "source": [ + "z = th.add(x,y)\n", + "z" + ], + "execution_count": 35, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:50535594766 -> bob:51540055303]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 35 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "l5f2LR0qE2X9", + "colab_type": "code", + "colab": {}, + "outputId": "52359fdc-3803-4566-ad73-1febff74005d" + }, + "source": [ + "z = z.get()\n", + "z" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([2, 3, 4, 5, 6])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 34 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "eGgQ_KyfE2YG", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)\n", + "y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "GQ6cZfXZE2YO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = (x + y).sum()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "8VQc2mGIE2YS", + "colab_type": 
"code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "05a6d540-22bd-4c82-9b0a-95b276b66891" + }, + "source": [ + "z.backward()" + ], + "execution_count": 38, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:73643338848 -> bob:22775581503]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 38 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "iK2WcPcfE2YV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3U1T13QIE2YY", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "4ad2cd53-977f-41fb-ca0f-4c646403cc37" + }, + "source": [ + "x" + ], + "execution_count": 40, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1., 2., 3., 4., 5.], requires_grad=True)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 40 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "9uzsSomoE2Yb", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "8acb12d4-dbfa-4699-e230-4a3c4cf4382d" + }, + "source": [ + "x.grad" + ], + "execution_count": 41, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1., 1., 1., 1., 1.])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 41 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UprYszJBE2Yv", + "colab_type": "text" + }, + "source": [ + "# Project: Learn a Simple Linear Model\n", + "\n", + "In this project, I'd like you to create a simple linear model which solves the dataset below. You should use only Variables and .backward() to do so (no optimizers or nn.Modules).
Furthermore, you must do so with both the data and the model being located on Bob's machine." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JbRRAuHNE2Yw", + "colab_type": "code", + "colab": {} + }, + "source": [ + "input = th.tensor([[1.,1],[0,1,],[1,0],[0,0]], requires_grad=True).send(bob)\n", + "target = th.tensor([[1.],[1],[0],[0]], requires_grad=True).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Cpr5V1oQE2Y1", + "colab_type": "code", + "colab": {} + }, + "source": [ + "weights = th.tensor([[0.],[0.]], requires_grad=True).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "RZCHcOOYE2Y5", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 207 + }, + "outputId": "6acb5812-dfc1-4d12-995e-f65027098d52" + }, + "source": [ + "for i in range(10) :\n", + " \n", + " pred = input.mm(weights)\n", + " \n", + " loss = ((pred - target)**2).sum()\n", + " \n", + " loss.backward()\n", + " \n", + " weights.data.sub_(weights.grad * 0.1)\n", + " weights.grad *= 0\n", + " \n", + " print(loss.get().data)" + ], + "execution_count": 55, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor(2.)\n", + "tensor(0.5600)\n", + "tensor(0.2432)\n", + "tensor(0.1372)\n", + "tensor(0.0849)\n", + "tensor(0.0538)\n", + "tensor(0.0344)\n", + "tensor(0.0220)\n", + "tensor(0.0141)\n", + "tensor(0.0090)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lDuciVmVE2ZX", + "colab_type": "text" + }, + "source": [ + "# Lesson: Garbage Collection and Common Errors\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "zeeBmdeBE2ZY", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = bob.clear_objects()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "c_dG8qyjE2Zc", + "colab_type": "code", + 
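The remote training loop above can be sanity-checked locally with plain Python (no torch, no PySyft). The sketch below redoes the same least-squares updates by hand, under the stated assumptions that `pred = X.mm(w)`, `loss = ((pred - target)**2).sum()`, the gradient is `2 * X^T (pred - target)`, and the learning rate is 0.1, matching the loop in the notebook.

```python
# dataset and weights from the project above, as plain Python lists
X = [[1., 1.], [0., 1.], [1., 0.], [0., 0.]]
y = [1., 1., 0., 0.]
w = [0., 0.]

losses = []
for i in range(10):
    pred = [row[0] * w[0] + row[1] * w[1] for row in X]    # pred = X.mm(w)
    err = [p - t for p, t in zip(pred, y)]
    loss = sum(e * e for e in err)                         # squared-error sum
    grad = [2 * sum(row[j] * e for row, e in zip(X, err))  # d(loss)/dw
            for j in range(2)]
    w = [wj - 0.1 * gj for wj, gj in zip(w, grad)]         # gradient step
    losses.append(loss)

print([round(l, 4) for l in losses[:3]])   # [2.0, 0.56, 0.2432]
```

The first three losses reproduce the values printed by the remote loop (2.0, 0.5600, 0.2432), which confirms that sending the data and model to Bob changes *where* the arithmetic happens, not *what* it computes.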
"colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "4c97ae7b-7a9c-4658-ec2a-59dcb5fce0bd" + }, + "source": [ + "bob._objects" + ], + "execution_count": 57, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 57 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7E_AvXqhE2Zm", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "FtD-K110E2Z0", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "6c5788fb-4fd0-470d-dd7f-bf7299a215fd" + }, + "source": [ + "bob._objects" + ], + "execution_count": 59, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{55915786223: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 59 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7TBlYOTrE2Z7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "del x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3AnNCwPxE2aS", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "8d4bb98c-174f-4b2d-8d18-15d27f108ed9" + }, + "source": [ + "bob._objects" + ], + "execution_count": 61, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 61 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "OfkRg6TtE2aY", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "tZZ3xlE7E2ad", + 
"colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "87b0de30-a0d6-4ee6-9105-0e12270aeba3" + }, + "source": [ + "bob._objects" + ], + "execution_count": 63, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{21618928272: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 63 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "LhXfiElLE2ao", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = \"asdf\"" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Vu5OvXnZE2at", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "4d1d451b-064a-437e-a7d3-223d08ec8ac2" + }, + "source": [ + "bob._objects" + ], + "execution_count": 65, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 65 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FU4hho_TE2ax", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "h3HIbeUpE2a1", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "9a432b9c-296f-4b8c-fbc1-8e4b661fa7f1" + }, + "source": [ + "x" + ], + "execution_count": 67, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:36775045188 -> bob:37172090972]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 67 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "phW08mflE2a5", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": 
"196558bd-c605-4d35-8765-3d38cc3dda9d" + }, + "source": [ + "bob._objects" + ], + "execution_count": 68, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{37172090972: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 68 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "kSNz-XtPE2a7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = \"asdf\"" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "0O-WfyAIE2bB", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "45c9145a-5679-492d-e33d-146c1ebbb627" + }, + "source": [ + "bob._objects" + ], + "execution_count": 70, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{37172090972: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 70 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "_7tk5rnHE2bG", + "colab_type": "code", + "colab": {} + }, + "source": [ + "del x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZpQlQ8m7E2bO", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "c00d8416-01f0-4bac-fbbc-dcc5600d2b6c" + }, + "source": [ + "bob._objects" + ], + "execution_count": 72, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{37172090972: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 72 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ynLJVJKkE2bT", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "b7e4cf23-f53f-4766-f054-151a727429d8" + }, + "source": [ + "bob = bob.clear_objects()\n", + "bob._objects" + ], + "execution_count": 73, + "outputs": 
[ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 73 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "cp8XOg8DE2bY", + "colab_type": "code", + "colab": {} + }, + "source": [ + "for i in range(1000):\n", + " x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "DpqZexxhE2bg", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "304ef03f-1770-4402-d53e-90ffa84e3073" + }, + "source": [ + "bob._objects" + ], + "execution_count": 75, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{78064462313: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 75 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "SBdwnN4HE2bj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)\n", + "y = th.tensor([1,1,1,1,1])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "M-ZZq3sIE2bt", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 548 + }, + "outputId": "5e806ca9-6a10-49ef-e9cc-4115472ae62b" + }, + "source": [ + "z = x + y" + ], + "execution_count": 77, + "outputs": [ + { + "output_type": "error", + "ename": "TensorsNotCollocatedException", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mPureTorchTensorFoundError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 668\u001b[0m new_self, new_args, new_kwargs = 
syft.frameworks.torch.hook_args.unwrap_args_from_method(\n\u001b[0;32m--> 669\u001b[0;31m \u001b[0mmethod_name\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 670\u001b[0m )\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36munwrap_args_from_method\u001b[0;34m(attr, method_self, args, kwargs)\u001b[0m\n\u001b[1;32m 125\u001b[0m \u001b[0;31m# Try running it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 126\u001b[0;31m \u001b[0mnew_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_args\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook_args\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmethod_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 127\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 356\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 357\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 358\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36mtwo_fold\u001b[0;34m(lambdas, args, **kwargs)\u001b[0m\n\u001b[1;32m 521\u001b[0m \u001b[0;32mdef\u001b[0m 
\u001b[0mtwo_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 522\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 523\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 356\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 357\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 358\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36mtuple_one_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 515\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtuple_one_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 516\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 517\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 334\u001b[0m \u001b[0;31m# Last if not, rule is probably == 1 so use type to return the right transformation.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 335\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mforward_func\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 336\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrules\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# And do this for all the args / rules provided\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 57\u001b[0m 
\u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 58\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 59\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(.0)\u001b[0m\n\u001b[1;32m 57\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 58\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 59\u001b[0m 
\u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mPureTorchTensorFoundError\u001b[0m: ", + "\nDuring handling of the above exception, another exception occurred:\n", + "\u001b[0;31mTensorsNotCollocatedException\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/usr/local/lib/python3.6/dist-packages/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 671\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 672\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 673\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 674\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 675\u001b[0m \u001b[0;31m# Send the new command to the appropriate class and get the response\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + 
"\u001b[0;31mTensorsNotCollocatedException\u001b[0m: You tried to call a method involving two tensors where one tensor is actually located on another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:1195382394 -> bob:83780156435]\nTensor B: tensor([1, 1, 1, 1, 1])" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XJPf3K0ME2by", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)\n", + "y = th.tensor([1,1,1,1,1]).send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "uv8nAIG5E2b2", + "colab_type": "code", + "colab": {}, + "outputId": "ec504cd3-a312-4539-f591-c7cf47567f5b" + }, + "source": [ + "z = x + y" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "error", + "ename": "TensorsNotCollocatedException", + "evalue": "You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). 
Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mPureTorchTensorFoundError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 561\u001b[0m new_self, new_args = syft.frameworks.torch.hook_args.hook_method_args(\n\u001b[0;32m--> 562\u001b[0;31m \u001b[0mmethod_name\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 563\u001b[0m )\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mhook_method_args\u001b[0;34m(attr, method_self, args)\u001b[0m\n\u001b[1;32m 85\u001b[0m \u001b[0;31m# Try running it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 86\u001b[0;31m \u001b[0mnew_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_args\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mhook_args\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmethod_self\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 87\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m 
\u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtwo_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 420\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtwo_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 421\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 422\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 270\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 271\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 272\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + 
"\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36mtuple_one_fold\u001b[0;34m(lambdas, args)\u001b[0m\n\u001b[1;32m 414\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtuple_one_fold\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 415\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mlambdas\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 416\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 248\u001b[0m \u001b[0;31m# Last if not, rule is probably == 1 so use type to return the right transformation.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 249\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mforward_func\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 250\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mr\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrules\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# And do this for all the args / rules 
provided\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(i)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook_args.py\u001b[0m in \u001b[0;36m\u001b[0;34m(.0)\u001b[0m\n\u001b[1;32m 34\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"child\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 35\u001b[0;31m \u001b[0;32melse\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0m_\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m 
\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mthrow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPureTorchTensorFoundError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 36\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mParameter\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mPureTorchTensorFoundError\u001b[0m: tensor([1, 1, 1, 1, 1])", + "\nDuring handling of the above exception, another exception occurred:\n", + "\u001b[0;31mTensorsNotCollocatedException\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/Users/atrask/anaconda/lib/python3.6/site-packages/syft-0.1.2a1-py3.6.egg/syft/frameworks/torch/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 564\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 565\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 566\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 567\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 568\u001b[0m \u001b[0;31m# Send the new command to the appropriate class and get the response\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTensorsNotCollocatedException\u001b[0m: You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.\n\nTensor A: [PointerTensor | me:46419059800 -> bob:14412738960]\nTensor B: tensor([1, 1, 1, 1, 1])" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DlxkHxwCE2cA", + "colab_type": "text" + }, + "source": [ + "# Lesson: Toy Federated Learning\n", + "\n", + "Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:\n", + "\n", + "- a toy dataset\n", + "- a model\n", + "- some basic training logic for training a model to fit the data."
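The three ingredients the markdown cell lists (a toy dataset, a model, training logic) can also be sketched without any framework at all. The sketch below is purely illustrative — a hand-rolled one-weight linear model fit with gradient descent on squared error, not the notebook's `nn.Linear` code — but it is the same loop the next few cells build with torch.

```python
# Toy dataset: inputs x with targets y = 2x (made-up values for illustration)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Toy "model": a single scalar weight, initialized at zero
w = 0.0

# Basic training logic: gradient descent on the summed squared error
lr = 0.05
for _ in range(200):
    # d/dw of sum((w*x - y)^2) over the dataset
    grad = sum(2 * (w * x - y) * x for x, y in data)
    w -= lr * grad

# w converges toward the true slope, 2.0
```

The torch version in the following cells is this exact loop with `opt.zero_grad()`, `loss.backward()`, and `opt.step()` standing in for the hand-computed gradient update.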
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MF2BQGKRE2cB", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from torch import nn, optim" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LvAQGYdrE2cD", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# A Toy Dataset\n", + "data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)\n", + "target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LA40j-pZE2cG", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# A Toy Model\n", + "model = nn.Linear(2,1)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ALzYcU8uE2cM", + "colab_type": "code", + "colab": {} + }, + "source": [ + "opt = optim.SGD(params=model.parameters(), lr=0.1)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "2PhhYsjSE2cO", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 397 + }, + "outputId": "7da6e6de-e5dd-402a-a018-e72b8dc7ab32" + }, + "source": [ + "def train(iterations=20):\n", + " for iter in range(iterations):\n", + " opt.zero_grad()\n", + "\n", + " pred = model(data)\n", + "\n", + " loss = ((pred - target)**2).sum()\n", + "\n", + " loss.backward()\n", + "\n", + " opt.step()\n", + "\n", + " print(loss.data)\n", + " \n", + "train()" + ], + "execution_count": 82, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor(4.7627)\n", + "tensor(1.8633)\n", + "tensor(1.1185)\n", + "tensor(0.7219)\n", + "tensor(0.4714)\n", + "tensor(0.3092)\n", + "tensor(0.2036)\n", + "tensor(0.1347)\n", + "tensor(0.0896)\n", + "tensor(0.0599)\n", + "tensor(0.0403)\n", + "tensor(0.0273)\n", + "tensor(0.0186)\n", + "tensor(0.0128)\n", + "tensor(0.0089)\n", + "tensor(0.0062)\n", + "tensor(0.0044)\n", + 
"tensor(0.0031)\n", + "tensor(0.0022)\n", + "tensor(0.0016)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "paWODmkVE2cQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "data_bob = data[0:2].send(bob)\n", + "target_bob = target[0:2].send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "r_9jdPXSE2cS", + "colab_type": "code", + "colab": {} + }, + "source": [ + "data_alice = data[2:4].send(alice)\n", + "target_alice = target[2:4].send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "KThxLd6ZE2cU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "datasets = [(data_bob, target_bob), (data_alice, target_alice)]" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "KBWrset-E2cX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def train(iterations=20):\n", + "\n", + " model = nn.Linear(2,1)\n", + " opt = optim.SGD(params=model.parameters(), lr=0.1)\n", + " \n", + " for iter in range(iterations):\n", + "\n", + " for _data, _target in datasets:\n", + "\n", + " # send model to the data\n", + " model = model.send(_data.location)\n", + "\n", + " # do normal training\n", + " opt.zero_grad()\n", + " pred = model(_data)\n", + " loss = ((pred - _target)**2).sum()\n", + " loss.backward()\n", + " opt.step()\n", + "\n", + " # get smarter model back\n", + " model = model.get()\n", + "\n", + " print(loss.get())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VGI1v54TE2cd", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 777 + }, + "outputId": "db771c69-ca2b-4971-8aa7-a79391c7fecb" + }, + "source": [ + "train()" + ], + "execution_count": 87, + "outputs": [ + { + "output_type": "stream", + "text": [ + "tensor(0.5232, requires_grad=True)\n", + 
"tensor(0.3466, requires_grad=True)\n", + "tensor(0.1689, requires_grad=True)\n", + "tensor(0.2104, requires_grad=True)\n", + "tensor(0.0933, requires_grad=True)\n", + "tensor(0.1226, requires_grad=True)\n", + "tensor(0.0537, requires_grad=True)\n", + "tensor(0.0713, requires_grad=True)\n", + "tensor(0.0311, requires_grad=True)\n", + "tensor(0.0415, requires_grad=True)\n", + "tensor(0.0180, requires_grad=True)\n", + "tensor(0.0242, requires_grad=True)\n", + "tensor(0.0104, requires_grad=True)\n", + "tensor(0.0141, requires_grad=True)\n", + "tensor(0.0060, requires_grad=True)\n", + "tensor(0.0082, requires_grad=True)\n", + "tensor(0.0035, requires_grad=True)\n", + "tensor(0.0048, requires_grad=True)\n", + "tensor(0.0020, requires_grad=True)\n", + "tensor(0.0028, requires_grad=True)\n", + "tensor(0.0012, requires_grad=True)\n", + "tensor(0.0016, requires_grad=True)\n", + "tensor(0.0007, requires_grad=True)\n", + "tensor(0.0010, requires_grad=True)\n", + "tensor(0.0004, requires_grad=True)\n", + "tensor(0.0006, requires_grad=True)\n", + "tensor(0.0002, requires_grad=True)\n", + "tensor(0.0003, requires_grad=True)\n", + "tensor(0.0001, requires_grad=True)\n", + "tensor(0.0002, requires_grad=True)\n", + "tensor(7.5990e-05, requires_grad=True)\n", + "tensor(0.0001, requires_grad=True)\n", + "tensor(4.4305e-05, requires_grad=True)\n", + "tensor(7.0868e-05, requires_grad=True)\n", + "tensor(2.5924e-05, requires_grad=True)\n", + "tensor(4.2572e-05, requires_grad=True)\n", + "tensor(1.5240e-05, requires_grad=True)\n", + "tensor(2.5731e-05, requires_grad=True)\n", + "tensor(9.0140e-06, requires_grad=True)\n", + "tensor(1.5661e-05, requires_grad=True)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BTV34juME2ci", + "colab_type": "text" + }, + "source": [ + "# Lesson: Advanced Remote Execution Tools\n", + "\n", + "In the last section we trained a toy model using Federated Learning. 
We did this by calling .send() and .get() on our model, sending it to the location of training data, updating it, and then bringing it back. However, at the end of the example we realized that we needed to go a bit further to protect people's privacy. Namely, we want to average the gradients BEFORE calling .get(). That way, we won't ever see anyone's exact gradient (thus better protecting their privacy!!!)\n", + "\n", + "But, in order to do this, we need a few more pieces:\n", + "\n", + "- use a pointer to send a Tensor directly to another worker\n", + "\n", + "And in addition, while we're here, we're going to learn about a few more advanced tensor operations as well, which will help us both with this example and a few in the future!" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "9pMe_todE2cj", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "eea190fc-a377-419c-8e8a-2f387f76050c" + }, + "source": [ + "bob.clear_objects()\n", + "alice.clear_objects()" + ], + "execution_count": 90, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 90 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "f1OPBPt-E2c2", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "IWdkyQOSE2dN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "pZlq2po0E2dQ", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "2503f6ad-4479-473c-905d-b133a53edbc6" + }, + "source": [ + "bob._objects" + ], + "execution_count": 93, + "outputs": [ + { + "output_type": "execute_result", + "data":
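The markdown above says the goal is to average gradients BEFORE anyone calls .get(). PySyft's pointer machinery handles the remote part; the averaging itself is just an element-wise mean, sketched below in plain Python. The two "updates" are made-up numbers standing in for per-worker gradients, not anything PySyft produces.

```python
def average_updates(update_a, update_b):
    """Element-wise average of two equally sized parameter/gradient lists."""
    return [(a + b) / 2 for a, b in zip(update_a, update_b)]

# Pretend per-worker gradients (illustrative values only)
bob_update = [0.2, -0.4, 1.0]
alice_update = [0.6, 0.0, -1.0]

# Only this average would ever be revealed, not either individual update
avg = average_updates(bob_update, alice_update)
```

In the federated setting, this averaging would happen on a worker (or a trusted aggregator) so that no individual gradient is ever visible to the model owner.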
{ + "text/plain": [ + "{37859075752: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 93 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "DtZv95biE2dU", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "e3d648fa-dd6c-4714-b8fb-5f93e6bde696" + }, + "source": [ + "alice._objects" + ], + "execution_count": 94, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{58101277340: (Wrapper)>[PointerTensor | alice:58101277340 -> bob:37859075752]}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 94 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "KuquS-UVE2dW", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "43-6afdRE2dY", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "d5091720-7666-4dec-c1c1-56aa517d4798" + }, + "source": [ + "y" + ], + "execution_count": 96, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:88129808966 -> alice:84122619342]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 96 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "i-xlC4OoE2da", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "44eade0f-3397-46bc-e7e9-29a8d7d6fe22" + }, + "source": [ + "bob._objects" + ], + "execution_count": 97, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{24456453897: tensor([ 2, 4, 6, 8, 10]),\n", + " 37859075752: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 97 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "QU9qc1oxE2dc", + "colab_type": "code", 
+ "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "afe2caec-669a-4a9b-960c-105a94abf525" + }, + "source": [ + "alice._objects" + ], + "execution_count": 98, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{58101277340: (Wrapper)>[PointerTensor | alice:58101277340 -> bob:37859075752],\n", + " 84122619342: (Wrapper)>[PointerTensor | alice:84122619342 -> bob:24456453897]}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 98 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "etdXFw6NE2de", + "colab_type": "code", + "colab": {} + }, + "source": [ + "jon = sy.VirtualWorker(hook, id=\"jon\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "M-QUKzxlE2dg", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob.clear_objects()\n", + "alice.clear_objects()\n", + "\n", + "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "1PvVs3KqE2dk", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "e5eeb992-acec-4d16-ecb7-8c9fee23ed6b" + }, + "source": [ + "bob._objects" + ], + "execution_count": 101, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{34176644655: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 101 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bZ7dDzRPE2do", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "84af7203-3251-4943-e222-eb96c335c7ba" + }, + "source": [ + "alice._objects" + ], + "execution_count": 102, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{40148693556: (Wrapper)>[PointerTensor | alice:40148693556 -> bob:34176644655]}" + ] + }, + 
"metadata": { + "tags": [] + }, + "execution_count": 102 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "x1CnRTkeE2dq", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "a3a83228-bc0c-464f-9b8a-366aa7918d48" + }, + "source": [ + "x = x.get()\n", + "x" + ], + "execution_count": 103, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:40148693556 -> bob:34176644655]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 103 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "CzTdAzzEE2d0", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "d1eaedf1-0554-4867-d5e1-b90a850cbdef" + }, + "source": [ + "bob._objects" + ], + "execution_count": 104, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{34176644655: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 104 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "-McT4DncE2d3", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "5667938d-721e-41d1-aef9-2b4711b6a553" + }, + "source": [ + "alice._objects" + ], + "execution_count": 105, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 105 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "hqxHwLefE2d6", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "6fdedb0f-5301-473c-872d-018257bd8899" + }, + "source": [ + "x = x.get()\n", + "x" + ], + "execution_count": 106, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 2, 3, 4, 5])" + ] + }, + "metadata": { + "tags": [] + }, + 
"execution_count": 106 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "xfMxFTjdE2d9", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "09391885-6d6b-41b4-fdda-c421a4b6f356" + }, + "source": [ + "bob._objects" + ], + "execution_count": 107, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 107 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "cv59s5yUE2eC", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob.clear_objects()\n", + "alice.clear_objects()\n", + "\n", + "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QasIqL9OE2eG", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "90203533-bda5-433b-ce28-8b70fafbcbba" + }, + "source": [ + "bob._objects" + ], + "execution_count": 109, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{13352301478: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 109 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZG-14y9eE2eL", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "65a8b805-0096-4231-c80e-ebb1451b7e3a" + }, + "source": [ + "alice._objects" + ], + "execution_count": 110, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{35572827053: (Wrapper)>[PointerTensor | alice:35572827053 -> bob:13352301478]}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 110 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "RM-zdOgJE2eN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "del x" + ], + "execution_count": 0, + "outputs": [] + }, + 
{ + "cell_type": "code", + "metadata": { + "id": "I8eBVdS8E2eQ", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "bb87eab4-1735-49e6-aaa0-d13dd330217f" + }, + "source": [ + "bob._objects" + ], + "execution_count": 112, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 112 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "9sBSp3Z8E2eS", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "1aba4c33-731b-4c12-bb5d-b7c76a3d96a8" + }, + "source": [ + "alice._objects" + ], + "execution_count": 113, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 113 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HuoGhq_BE2ed", + "colab_type": "text" + }, + "source": [ + "# Lesson: Pointer Chain Operations" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "kr64_i4OE2ee", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "03061e3b-4adb-4ed7-e6c9-c6e12b097f58" + }, + "source": [ + "bob.clear_objects()\n", + "alice.clear_objects()" + ], + "execution_count": 114, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 114 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "sj76SaXqE2eh", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "BqWrqheAE2ek", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "8ffe60b5-6f6f-4b28-dbe7-2ee6307ae6f8" 
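The `_objects` dictionaries inspected throughout these cells are just per-worker id-to-tensor stores. A toy pure-Python model of the send/get bookkeeping is sketched below — illustrative only, not PySyft's implementation, which also handles serialization, chained pointers, and garbage collection (the `del x` behavior shown above).

```python
class ToyWorker:
    """Toy stand-in for a VirtualWorker: holds objects by id."""
    def __init__(self, name):
        self.name = name
        self._objects = {}

class ToyPointer:
    """Toy stand-in for a PointerTensor: remembers who holds the object."""
    def __init__(self, worker, obj_id):
        self.worker = worker
        self.obj_id = obj_id

    def get(self):
        # Retrieval removes the object from the remote store,
        # mirroring how .get() empties bob._objects above.
        return self.worker._objects.pop(self.obj_id)

def send(value, worker, obj_id):
    worker._objects[obj_id] = value
    return ToyPointer(worker, obj_id)

bob = ToyWorker("bob")
ptr = send([1, 2, 3, 4, 5], bob, obj_id=42)  # bob's store now holds the list
value = ptr.get()                            # ...and now it is empty again
```

Chained sends (`send(bob).send(alice)`) correspond to storing a ToyPointer in alice's store that points into bob's store, which is why `.get()` on such a chain only collapses one level at a time.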
+ }, + "source": [ + "bob._objects" + ], + "execution_count": 116, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{68235688951: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 116 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "zf-bPRw1E2en", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "0e291183-9aba-4a94-edc8-c82396c3c436" + }, + "source": [ + "alice._objects" + ], + "execution_count": 117, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 117 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "vdrWN53kE2eu", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "6fb737a6-7408-4f7b-9408-5642cebbf89d" + }, + "source": [ + "x.move(alice)" + ], + "execution_count": 118, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:41477165916 -> alice:39296539549]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 118 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "LZBXgWy_E2e6", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "2c8a0d66-8a2a-4c61-8fbf-e8f2a7e01c6f" + }, + "source": [ + "bob._objects" + ], + "execution_count": 119, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 119 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "QfHEOsyHE2fD", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "508c1658-3874-4b70-a62b-e31bb99e34cc" + }, + "source": [ + "alice._objects" + ], + "execution_count": 120, + 
"outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{39296539549: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 120 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "D6hg3ymUE2fN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4,5]).send(bob).send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "gjwfcQ20E2fQ", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "7270bed6-c027-4ea8-a7e7-11b15e6f8f1b" + }, + "source": [ + "bob._objects" + ], + "execution_count": 122, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{50519296812: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 122 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "26y4Z34TE2fV", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "2cfc3997-bfe9-4be9-fabb-4e327283f5e2" + }, + "source": [ + "alice._objects" + ], + "execution_count": 123, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{39296539549: tensor([1, 2, 3, 4, 5]),\n", + " 69575318925: (Wrapper)>[PointerTensor | alice:69575318925 -> bob:50519296812]}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 123 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "x3k7BqwDE2fZ", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "5175041a-e533-4fa1-b1ee-394d2d6a983a" + }, + "source": [ + "x.remote_get()" + ], + "execution_count": 124, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:76388321768 -> alice:69575318925]" + ] + }, + "metadata": { + "tags": [] + }, + 
"execution_count": 124 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "kJqVePqZE2fc", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "07381d62-387a-439f-a471-355401ab8df4" + }, + "source": [ + "bob._objects" + ], + "execution_count": 125, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 125 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "9hejiK9RE2ff", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "c2ce1f00-d097-4910-a4d8-c4975dd02b92" + }, + "source": [ + "alice._objects" + ], + "execution_count": 126, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{39296539549: tensor([1, 2, 3, 4, 5]), 69575318925: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 126 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "DWyYsW6ME2fk", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "2bc31dea-2729-48f5-955f-5d7142253d01" + }, + "source": [ + "x.move(bob)" + ], + "execution_count": 127, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:75207849564 -> bob:76388321768]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 127 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XyetkEa9E2fm", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "98a865f3-5385-41d8-9fb8-20b358da5519" + }, + "source": [ + "x" + ], + "execution_count": 128, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[PointerTensor | me:75207849564 -> bob:76388321768]" + ] + }, + "metadata": { + "tags": 
[] + }, + "execution_count": 128 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "5bGj8WToE2fo", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "2cfc4407-e683-4119-e5be-50ff73573454" + }, + "source": [ + "bob._objects" + ], + "execution_count": 129, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{76388321768: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 129 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "vGQK7LpPE2fq", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "outputId": "7d3876c3-89ad-48a8-8bc3-66b9c690f494" + }, + "source": [ + "alice._objects" + ], + "execution_count": 130, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{39296539549: tensor([1, 2, 3, 4, 5])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 130 + } + ] + } + ] +} \ No newline at end of file From b51fbeed32a33dce2263d9fbae0b1ed20ecaacfe Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Wed, 21 Aug 2019 18:21:45 +0530 Subject: [PATCH 07/14] Delete Section 3 - Securing Federated Learning.ipynb --- Section 3 - Securing Federated Learning.ipynb | 767 ------------------ 1 file changed, 767 deletions(-) delete mode 100644 Section 3 - Securing Federated Learning.ipynb diff --git a/Section 3 - Securing Federated Learning.ipynb b/Section 3 - Securing Federated Learning.ipynb deleted file mode 100644 index 4ed3906..0000000 --- a/Section 3 - Securing Federated Learning.ipynb +++ /dev/null @@ -1,767 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section: Securing Federated Learning\n", - "\n", - "- Lesson 1: Trusted Aggregator\n", - "- Lesson 2: Intro to Additive Secret Sharing\n", - "- Lesson 3: Intro to Fixed Precision 
Encoding\n", - "- Lesson 4: Secret Sharing + Fixed Precision in PySyft\n", - "- Final Project: Federated Learning wtih Encrypted Gradient Aggregation" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Federated Learning with a Trusted Aggregator\n", - "\n", - "In the last section, we learned how to train a model on a distributed dataset using Federated Learning. In particular, the last project aggregated gradients directly from one data owner to another. \n", - "\n", - "However, while in some cases it could be ideal to do this, what would be even better is to be able to choose a neutral third party to perform the aggregation.\n", - "\n", - "As it turns out, we can use the same tools we used previously to accomplish this." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Federated Learning with a Trusted Aggregator" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Intro to Additive Secret Sharing\n", - "\n", - "While being able to have a trusted third party to perform the aggregation is certainly nice, in an ideal setting we wouldn't have to trust anyone at all. This is where Cryptography can provide an interesting alterantive. 
\n", - "\n", - "Specifically, we're going to be looking at a simple protocol for Secure Multi-Party Computation called Additive Secret Sharing. This protocol will allow multiple parties (of size 3 or more) to aggregate their gradients without the use of a trusted 3rd party to perform the aggregation. In other words, we can add 3 numbers together from 3 different people without anyone ever learning the inputs of any other actors.\n", - "\n", - "Let's start by considering the number 5, which we'll put into a varible x" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "x = 5" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's say we wanted to SHARE the ownership of this number between two people, Alice and Bob. We could split this number into two shares, 2, and 3, and give one to Alice and one to Bob" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "5" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share = 2\n", - "alice_x_share = 3\n", - "\n", - "decrypted_x = bob_x_share + alice_x_share\n", - "decrypted_x" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note that neither Bob nor Alice know the value of x. They only know the value of their own SHARE of x. Thus, the true value of X is hidden (i.e., encrypted). \n", - "\n", - "The truly amazing thing, however, is that Alice and Bob can still compute using this value! They can perform arithmetic over the hidden value! Let's say Bob and Alice wanted to multiply this value by 2! If each of them multiplied their respective share by 2, then the hidden number between them is also multiplied! Check it out!" 
- ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "10" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share = 2 * 2\n", - "alice_x_share = 3 * 2\n", - "\n", - "decrypted_x = bob_x_share + alice_x_share\n", - "decrypted_x" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This even works for addition between two shared values!!" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "12" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# encrypted \"5\"\n", - "bob_x_share = 2\n", - "alice_x_share = 3\n", - "\n", - "# encrypted \"7\"\n", - "bob_y_share = 5\n", - "alice_y_share = 2\n", - "\n", - "# encrypted 5 + 7\n", - "bob_z_share = bob_x_share + bob_y_share\n", - "alice_z_share = alice_x_share + alice_y_share\n", - "\n", - "decrypted_z = bob_z_share + alice_z_share\n", - "decrypted_z" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see, we just added two numbers together while they were still encrypted!!!\n", - "\n", - "One small tweak - notice that since all our numbers are positive, it's possible for each share to reveal a little bit of information about the hidden value, namely, it's always greater than the share. Thus, if Bob has a share \"3\" then he knows that the encrypted value is at least 3.\n", - "\n", - "This would be quite bad, but can be solved through a simple fix. Decryption happens by summing all the shares together MODULUS some constant. I.e." 
- ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "23740629843736686616461" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = 5\n", - "\n", - "Q = 23740629843760239486723\n", - "\n", - "bob_x_share = 23552870267 # <- a random number\n", - "alice_x_share = Q - bob_x_share + x\n", - "alice_x_share" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "5" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "(bob_x_share + alice_x_share) % Q" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "So now, as you can see, both shares are wildly larger than the number being shared, meaning that individual shares no longer leak this inforation. However, all the properties we discussed earlier still hold! (addition, encryption, decryption, etc.)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Build Methods for Encrypt, Decrypt, and Add \n", - "\n", - "In this project, you must take the lessons we learned in the last section and write general methods for encrypt, decrypt, and add. Store shares for a variable in a tuple like so." - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [], - "source": [ - "x_share = (2,5,7)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Even though normally those shares would be distributed amongst several workers, you can store them in ordered tuples like this for now :)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Intro to Fixed Precision Encoding\n", - "\n", - "As you may remember, our goal is to aggregate gradients using this new Secret Sharing technique. However, the protocol we've just explored in the last section uses positive integers. However, our neural network weights are NOT integers. Instead, our weights are decimals (floating point numbers).\n", - "\n", - "Not a huge deal! We just need to use a fixed precision encoding, which lets us do computation over decimal numbers using integers!" 
- ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [], - "source": [ - "BASE=10\n", - "PRECISION=4" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [], - "source": [ - "def encode(x):\n", - " return int((x * (BASE ** PRECISION)) % Q)\n", - "\n", - "def decode(x):\n", - " return (x if x <= Q/2 else x - Q) / BASE**PRECISION" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "35000" - ] - }, - "execution_count": 27, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "encode(3.5)" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "3.5" - ] - }, - "execution_count": 28, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "decode(35000)" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "7.8" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = encrypt(encode(5.5))\n", - "y = encrypt(encode(2.3))\n", - "z = add(x,y)\n", - "decode(decrypt(z))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Secret Sharing + Fixed Precision in PySyft\n", - "\n", - "While writing things from scratch is certainly educational, PySyft makes a great deal of this much easier for us through its abstractions." 
- ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [], - "source": [ - "bob = bob.clear_objects()\n", - "alice = alice.clear_objects()\n", - "secure_worker = secure_worker.clear_objects()" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4,5])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Secret Sharing Using PySyft\n", - "\n", - "We can share using the simple .share() method!" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.share(bob, alice, secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{35498656553: tensor([ 10235770278698899, 1401398179551373756, 2277280072169145491,\n", - " 636965538565031298, 913795591610271305])}" - ] - }, - "execution_count": 33, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob._objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "and as you can see, Bob now has one of the shares of x! Furthermore, we can still call addition in this state, and PySyft will automatically perform the remote execution for us!" 
- ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [], - "source": [ - "y = x + x" - ] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>[AdditiveSharingTensor]\n", - "\t-> (Wrapper)>[PointerTensor | me:23637986557 -> bob:30254176063]\n", - "\t-> (Wrapper)>[PointerTensor | me:18229131498 -> alice:75856222543]\n", - "\t-> (Wrapper)>[PointerTensor | me:34301722959 -> secure_worker:75419815101]\n", - "\t*crypto provider: me*" - ] - }, - "execution_count": 35, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([ 2, 4, 6, 8, 10])" - ] - }, - "execution_count": 36, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y.get()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Fixed Precision using PySyft\n", - "\n", - "We can also convert a tensor to fixed precision using .fix_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([0.1,0.2,0.3])" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0.1000, 0.2000, 0.3000])" - ] - }, - "execution_count": 38, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.fix_prec()" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([100, 200, 300])" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x.child.child" - ] - }, - { - 
"cell_type": "code", - "execution_count": 41, - "metadata": {}, - "outputs": [], - "source": [ - "y = x + x" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0.2000, 0.4000, 0.6000])" - ] - }, - "execution_count": 42, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y = y.float_prec()\n", - "y" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Shared Fixed Precision\n", - "\n", - "And of course, we can combine the two!" - ] - }, - { - "cell_type": "code", - "execution_count": 43, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([0.1, 0.2, 0.3])" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.fix_prec().share(bob, alice, secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [], - "source": [ - "y = x + x" - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0.2000, 0.4000, 0.6000])" - ] - }, - "execution_count": 46, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y.get().float_prec()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Make sure to make the point that people can see the model averages in the clear." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Final Project: Federated Learning with Encrypted Gradient Aggregation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.1" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From 655a025ba7bb9cefeccaf866b29cc5336c1fab1b Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Sat, 24 Aug 2019 19:40:05 +0530 Subject: [PATCH 08/14] Add files via upload --- ..._Securing_Federated_Learning_(1)_(1).ipynb | 1869 +++++++++++++++++ 1 file changed, 1869 insertions(+) create mode 100644 Section_3_Securing_Federated_Learning_(1)_(1).ipynb diff --git a/Section_3_Securing_Federated_Learning_(1)_(1).ipynb b/Section_3_Securing_Federated_Learning_(1)_(1).ipynb new file mode 100644 index 0000000..a6ea756 --- /dev/null +++ b/Section_3_Securing_Federated_Learning_(1)_(1).ipynb @@ -0,0 +1,1869 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.1" + }, + "colab": { + "name": 
"Section_3_Securing_Federated_Learning_(1) (1).ipynb", + "version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Bns7FVdBPECL", + "colab_type": "text" + }, + "source": [ + "# Section: Securing Federated Learning\n", + "\n", + "- Lesson 1: Trusted Aggregator\n", + "- Lesson 2: Intro to Additive Secret Sharing\n", + "- Lesson 3: Intro to Fixed Precision Encoding\n", + "- Lesson 4: Secret Sharing + Fixed Precision in PySyft\n", + "- Final Project: Federated Learning wtih Encrypted Gradient Aggregation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "khh9Rcq2PECN", + "colab_type": "text" + }, + "source": [ + "# Lesson: Federated Learning with a Trusted Aggregator\n", + "\n", + "In the last section, we learned how to train a model on a distributed dataset using Federated Learning. In particular, the last project aggregated gradients directly from one data owner to another. \n", + "\n", + "However, while in some cases it could be ideal to do this, what would be even better is to be able to choose a neutral third party to perform the aggregation.\n", + "\n", + "As it turns out, we can use the same tools we used previously to accomplish this." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A2mriQAHPECP", + "colab_type": "text" + }, + "source": [ + "# Project: Federated Learning with a Trusted Aggregator" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "5zUGlg_KvSMH", + "colab_type": "code", + "outputId": "2c43f713-9aec-4974-d4d6-fde16558abbc", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 873 + } + }, + "source": [ + "pip install syft" + ], + "execution_count": 4, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Requirement already satisfied: syft in /usr/local/lib/python3.6/dist-packages (0.1.24a1)\n", + "Requirement already satisfied: flask-socketio>=3.3.2 in /usr/local/lib/python3.6/dist-packages (from syft) (4.2.1)\n", + "Requirement already satisfied: tblib>=1.4.0 in /usr/local/lib/python3.6/dist-packages (from syft) (1.4.0)\n", + "Requirement already satisfied: websockets>=7.0 in /usr/local/lib/python3.6/dist-packages (from syft) (8.0.2)\n", + "Requirement already satisfied: lz4>=2.1.6 in /usr/local/lib/python3.6/dist-packages (from syft) (2.1.10)\n", + "Requirement already satisfied: msgpack>=0.6.1 in /usr/local/lib/python3.6/dist-packages (from syft) (0.6.1)\n", + "Requirement already satisfied: torchvision==0.3.0 in /usr/local/lib/python3.6/dist-packages (from syft) (0.3.0)\n", + "Requirement already satisfied: zstd>=1.4.0.0 in /usr/local/lib/python3.6/dist-packages (from syft) (1.4.1.0)\n", + "Requirement already satisfied: torch==1.1 in /usr/local/lib/python3.6/dist-packages (from syft) (1.1.0)\n", + "Requirement already satisfied: tf-encrypted!=0.5.7,>=0.5.4 in /usr/local/lib/python3.6/dist-packages (from syft) (0.5.8)\n", + "Requirement already satisfied: numpy>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from syft) (1.16.4)\n", + "Requirement already satisfied: websocket-client>=0.56.0 in /usr/local/lib/python3.6/dist-packages (from syft) (0.56.0)\n", + "Requirement already satisfied: scikit-learn>=0.21.0 in 
/usr/local/lib/python3.6/dist-packages (from syft) (0.21.3)\n", + "Requirement already satisfied: Flask>=1.0.2 in /usr/local/lib/python3.6/dist-packages (from syft) (1.1.1)\n", + "Requirement already satisfied: python-socketio>=4.3.0 in /usr/local/lib/python3.6/dist-packages (from flask-socketio>=3.3.2->syft) (4.3.1)\n", + "Requirement already satisfied: pillow>=4.1.1 in /usr/local/lib/python3.6/dist-packages (from torchvision==0.3.0->syft) (4.3.0)\n", + "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from torchvision==0.3.0->syft) (1.12.0)\n", + "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.6/dist-packages (from tf-encrypted!=0.5.7,>=0.5.4->syft) (5.1.2)\n", + "Requirement already satisfied: tensorflow<2,>=1.12.0 in /usr/local/lib/python3.6/dist-packages (from tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.6/dist-packages (from scikit-learn>=0.21.0->syft) (1.3.1)\n", + "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.6/dist-packages (from scikit-learn>=0.21.0->syft) (0.13.2)\n", + "Requirement already satisfied: itsdangerous>=0.24 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (1.1.0)\n", + "Requirement already satisfied: Jinja2>=2.10.1 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (2.10.1)\n", + "Requirement already satisfied: click>=5.1 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (7.0)\n", + "Requirement already satisfied: Werkzeug>=0.15 in /usr/local/lib/python3.6/dist-packages (from Flask>=1.0.2->syft) (0.15.5)\n", + "Requirement already satisfied: python-engineio>=3.9.0 in /usr/local/lib/python3.6/dist-packages (from python-socketio>=4.3.0->flask-socketio>=3.3.2->syft) (3.9.3)\n", + "Requirement already satisfied: olefile in /usr/local/lib/python3.6/dist-packages (from pillow>=4.1.1->torchvision==0.3.0->syft) (0.46)\n", + "Requirement already 
satisfied: tensorboard<1.15.0,>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.0.8)\n", + "Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.1.7)\n", + "Requirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.14.0)\n", + "Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.1.0)\n", + "Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.1.0)\n", + "Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.33.4)\n", + "Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.7.1)\n", + "Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.8.0)\n", + "Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.15.0)\n", + "Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (3.7.1)\n", + "Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from 
tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (1.11.2)\n", + "Requirement already satisfied: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (0.2.2)\n", + "Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.6/dist-packages (from Jinja2>=2.10.1->Flask>=1.0.2->syft) (1.1.1)\n", + "Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (41.2.0)\n", + "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (3.1.1)\n", + "Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.6->tensorflow<2,>=1.12.0->tf-encrypted!=0.5.7,>=0.5.4->syft) (2.8.0)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "T-RxOTPWPECX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import syft as sy\n", + "import torch as th\n", + "hook = sy.TorchHook(th)\n", + "from torch import nn, optim" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "yKuWVjEhPECg", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\")\n", + "alice = sy.VirtualWorker(hook, id=\"alice\")\n", + "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9sxy3sIjPECz", + "colab_type": "code", + "outputId": "3ac7f169-5c2b-4ad2-e4a1-81d37f52e3a9", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 150 + } + }, + "source": [ + "bob.add_workers([alice, secure_worker])\n", + "alice.add_workers([bob, secure_worker])\n", + "secure_worker.add_workers([alice, 
bob])" + ], + "execution_count": 7, + "outputs": [ + { + "output_type": "stream", + "text": [ + "W0824 14:07:27.478001 139786501089152 base.py:654] Worker alice already exists. Replacing old worker which could cause unexpected behavior\n", + "W0824 14:07:27.479735 139786501089152 base.py:654] Worker secure_worker already exists. Replacing old worker which could cause unexpected behavior\n", + "W0824 14:07:27.481107 139786501089152 base.py:654] Worker bob already exists. Replacing old worker which could cause unexpected behavior\n", + "W0824 14:07:27.482034 139786501089152 base.py:654] Worker secure_worker already exists. Replacing old worker which could cause unexpected behavior\n", + "W0824 14:07:27.483205 139786501089152 base.py:654] Worker alice already exists. Replacing old worker which could cause unexpected behavior\n", + "W0824 14:07:27.485067 139786501089152 base.py:654] Worker bob already exists. Replacing old worker which could cause unexpected behavior\n" + ], + "name": "stderr" + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 7 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "2vuF7J3uPEC9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]] ,requires_grad=False)\n", + "target = th.tensor([[0],[0],[1],[1.]] , requires_grad=False)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "byA57WMMPEDJ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bobs_data = data[0:2].send(bob)\n", + "bobs_target = target[0:2].send(bob)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "MsJrqjGFPEDQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "alices_data = data[2:].send(alice)\n", + "alices_target = target[2:].send(alice)" + ], + "execution_count": 0, + "outputs": [] + }, + { + 
"cell_type": "code", + "metadata": { + "id": "vWhy7XESy7Di", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model = nn.Linear(2,1)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QTM1qHUty7Aj", + "colab_type": "code", + "outputId": "b9e03a1f-ceaa-4014-f41d-3742dc393c63", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 207 + } + }, + "source": [ + "for round_iter in range(10) : \n", + " \n", + " bobs_model = model.copy().send(bob)\n", + " alices_model = model.copy().send(alice)\n", + "\n", + " bobs_opt = optim.SGD(params=bobs_model.parameters(), lr=0.1)\n", + " alices_opt = optim.SGD(params=alices_model.parameters(), lr=0.1)\n", + "\n", + " for i in range(10) :\n", + "\n", + " bobs_opt.zero_grad()\n", + " bobs_pred = bobs_model(bobs_data)\n", + " bobs_loss = ((bobs_pred - bobs_target) **2).sum()\n", + " bobs_loss.backward()\n", + "\n", + " bobs_opt.step()\n", + " bobs_loss = bobs_loss.get().data\n", + " bobs_loss\n", + " \n", + " alices_opt.zero_grad()\n", + " alices_pred = alices_model(alices_data)\n", + " alices_loss = ((alices_pred - alices_target) **2).sum()\n", + " alices_loss.backward()\n", + "\n", + " alices_opt.step()\n", + " alices_loss = alices_loss.get().data\n", + " alices_loss\n", + " \n", + " \n", + " \n", + " alices_model.move(secure_worker)\n", + " bobs_model.move(secure_worker) \n", + " \n", + " \n", + " \n", + " secure_worker.clear_objects()\n", + " \n", + " print(\"Bob:\" + str(bobs_loss) + \" Alice:\" + str(alices_loss))" + ], + "execution_count": 12, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + 
"Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n", + "Bob:tensor(1.3614e-05) Alice:tensor(0.0131)\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "j73tODLAPEDT", + "colab_type": "text" + }, + "source": [ + "# Lesson: Intro to Additive Secret Sharing\n", + "\n", + "While being able to have a trusted third party to perform the aggregation is certainly nice, in an ideal setting we wouldn't have to trust anyone at all. This is where Cryptography can provide an interesting alternative. \n", + "\n", + "Specifically, we're going to be looking at a simple protocol for Secure Multi-Party Computation called Additive Secret Sharing. This protocol will allow multiple parties (3 or more) to aggregate their gradients without the use of a trusted 3rd party to perform the aggregation. In other words, we can add 3 numbers together from 3 different people without anyone ever learning the inputs of any other actors.\n", + "\n", + "Let's start by considering the number 5, which we'll put into a variable x" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7qcfy8VIPEDc", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = 5" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a_gYa2zqPEDg", + "colab_type": "text" + }, + "source": [ + "Let's say we wanted to SHARE the ownership of this number between two people, Alice and Bob. 
We could split this number into two shares, 2, and 3, and give one to Alice and one to Bob" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8NSqUqqMPEDi", + "colab_type": "code", + "outputId": "b3a00dbe-0345-4101-9b06-b4ee1b36e818", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "bob_x_share = 2\n", + "alice_x_share = 3\n", + "\n", + "decrypted_x = bob_x_share + alice_x_share\n", + "decrypted_x" + ], + "execution_count": 14, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "5" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 14 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yoL9-YqYPEDx", + "colab_type": "text" + }, + "source": [ + "Note that neither Bob nor Alice know the value of x. They only know the value of their own SHARE of x. Thus, the true value of X is hidden (i.e., encrypted). \n", + "\n", + "The truly amazing thing, however, is that Alice and Bob can still compute using this value! They can perform arithmetic over the hidden value! Let's say Bob and Alice wanted to multiply this value by 2! If each of them multiplied their respective share by 2, then the hidden number between them is also multiplied! Check it out!" 
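The same idea generalizes beyond two parties, and multiplication by a public constant stays entirely local: each holder scales only their own share, and the reconstructed secret comes out scaled too. A minimal sketch in plain Python (the helper name `public_mul` is ours, not part of the notebook):

```python
def public_mul(shares, k):
    # each party multiplies its own share by the public constant k;
    # nobody ever needs to see the underlying secret
    return tuple(s * k for s in shares)

# the secret 5, split as 2 + 3 between Bob and Alice
shares = (2, 3)

doubled = public_mul(shares, 2)
assert doubled == (4, 6)
assert sum(doubled) == 10  # reconstructing reveals 5 * 2
```

This works for any number of shares, since multiplying each term of a sum by `k` multiplies the whole sum by `k`.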
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "HWikFfJuPEDz", + "colab_type": "code", + "outputId": "853d1b31-fae8-45d4-fb72-6e314e9d7952", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "bob_x_share = 2 * 2\n", + "alice_x_share = 3 * 2\n", + "\n", + "decrypted_x = bob_x_share + alice_x_share\n", + "decrypted_x" + ], + "execution_count": 15, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "10" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 15 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "i0ki4sfOPEEC", + "colab_type": "text" + }, + "source": [ + "This even works for addition between two shared values!!" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "YzkCfTRzPEEE", + "colab_type": "code", + "outputId": "451c55d5-9bc9-447a-d8d2-64fb212d5e01", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "# encrypted \"5\"\n", + "bob_x_share = 2\n", + "alice_x_share = 3\n", + "\n", + "# encrypted \"7\"\n", + "bob_y_share = 5\n", + "alice_y_share = 2\n", + "\n", + "# encrypted 5 + 7\n", + "bob_z_share = bob_x_share + bob_y_share\n", + "alice_z_share = alice_x_share + alice_y_share\n", + "\n", + "decrypted_z = bob_z_share + alice_z_share\n", + "decrypted_z" + ], + "execution_count": 16, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "12" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 16 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "rJzJvmPdPEEP", + "colab_type": "text" + }, + "source": [ + "As you can see, we just added two numbers together while they were still encrypted!!!\n", + "\n", + "One small tweak - notice that since all our numbers are positive, it's possible for each share to reveal a little bit of information about the hidden value, namely, it's always greater than the share. 
Thus, if Bob has a share \"3\" then he knows that the encrypted value is at least 3.\n", + "\n", + "This would be quite bad, but can be solved through a simple fix. Decryption happens by summing all the shares together MODULO some constant. I.e." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ge1RpRD2PEER", + "colab_type": "code", + "outputId": "18ee4186-5257-4b63-c734-23a70b81c07e", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "x = 5\n", + "\n", + "Q = 23740629843760239486723\n", + "\n", + "bob_x_share = 23552870267 # <- a random number\n", + "alice_x_share = Q - bob_x_share + x\n", + "alice_x_share" + ], + "execution_count": 17, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "23740629843736686616461" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 17 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "qd9tXlpSPEEi", + "colab_type": "code", + "outputId": "b9998a28-770b-40cc-9108-65bac2260fe7", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + } + }, + "source": [ + "(bob_x_share + alice_x_share) % Q" + ], + "execution_count": 18, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "5" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 18 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4KyTB7-CPEEl", + "colab_type": "text" + }, + "source": [ + "So now, as you can see, both shares are wildly larger than the number being shared, meaning that individual shares no longer leak this information. However, all the properties we discussed earlier still hold! 
(addition, encryption, decryption, etc.)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XlIpOGZJPEEn", + "colab_type": "text" + }, + "source": [ + "# Project: Build Methods for Encrypt, Decrypt, and Add \n", + "\n", + "In this project, you must take the lessons we learned in the last section and write general methods for encrypt, decrypt, and add. Store shares for a variable in a tuple like so." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "A4V2_GxSPEEq", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x_share = (2,5,7)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jvbILJkRPEEw", + "colab_type": "text" + }, + "source": [ + "Even though normally those shares would be distributed amongst several workers, you can store them in ordered tuples like this for now :)" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "79Ha9uGCPEEx", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import random" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "AWBxHZR_PEE0", + "colab_type": "code", + "colab": {} + }, + "source": [ + "Q=23740629843760239486723" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nKmD-BTFPEE3", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x= 5" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "G99szq-_PEE-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def encrypt(x, n_shares=3) :\n", + " \n", + " shares = list()\n", + " \n", + " for i in range(n_shares - 1) :\n", + " shares.append(random.randint(0,Q))\n", + " \n", + " final_share = Q - (sum(shares) % Q) + x\n", + " \n", + " shares.append(final_share)\n", + " \n", + " return tuple(shares)\n", + " " + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": 
"PTmegPWDPEFD", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def decrypt(shares):\n", + " return sum(shares) % Q" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ialWJFevfyrY", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def add(a, b) :\n", + " \n", + " c= list()\n", + " \n", + " assert(len(a) == len(b))\n", + " \n", + " for i in range(len(a)) :\n", + " c.append((a[i] + b[i]) % Q)\n", + " \n", + " return tuple(c) " + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "hveBEIe-fyfZ", + "colab_type": "code", + "outputId": "200c7f12-bf58-4d22-dbd2-e5dab00c424b", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "decrypt(add(encrypt(5), encrypt(10)))" + ], + "execution_count": 26, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "15" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 26 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "KK3MUF-3fyJf", + "colab_type": "code", + "outputId": "fc7bec51-18c8-40d3-ff81-39df3a3f47ea", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x = encrypt(5)\n", + "x" + ], + "execution_count": 27, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(8184938992920865048999, 3632988170551367062397, 11922702680288007375332)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 27 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ygMlwv-LgxHX", + "colab_type": "code", + "outputId": "348f70ed-1f5e-44f1-d6d7-9f18ad0f5a95", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y = encrypt(10)\n", + "y" + ], + "execution_count": 28, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(2626642703191711123736, 
15602103524767753345206, 5511883615800775017791)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 28 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IhNsQVTxgw7V", + "colab_type": "code", + "outputId": "e8e09062-5fbf-4d1b-f166-0779a9b6f292", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "z = add(x,y)\n", + "z" + ], + "execution_count": 29, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(10811581696112576172735, 19235091695319120407603, 17434586296088782393123)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 29 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "m4oopdvIg-zg", + "colab_type": "code", + "outputId": "d3d4d45a-bccd-4880-dba3-93d6e93c358d", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "decrypt(z)" + ], + "execution_count": 30, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "15" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 30 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Dtn6BHm2PEFV", + "colab_type": "text" + }, + "source": [ + "# Lesson: Intro to Fixed Precision Encoding\n", + "\n", + "As you may remember, our goal is to aggregate gradients using this new Secret Sharing technique. However, the protocol we've just explored in the last section works over non-negative integers, whereas our neural network weights are NOT integers. Instead, our weights are decimals (floating point numbers).\n", + "\n", + "Not a huge deal! We just need to use a fixed precision encoding, which lets us do computation over decimal numbers using integers!"
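The core idea, sketched in plain Python: scale each decimal by `BASE ** PRECISION`, reduce modulo `Q`, and treat field elements above `Q/2` as negatives on decode, so that negative weights round-trip too. The constants match the cells below; the use of `round` is our small addition to avoid floating-point truncation surprises:

```python
BASE = 10
PRECISION = 4
Q = 23740629843760239486723

def encode(x_dec):
    # scale the decimal up to an integer, then wrap into the field [0, Q)
    return int(round(x_dec * BASE ** PRECISION)) % Q

def decode(x_fp):
    # elements in the upper half of the field represent negative numbers
    return (x_fp if x_fp <= Q // 2 else x_fp - Q) / BASE ** PRECISION

assert decode(encode(0.5)) == 0.5
assert decode(encode(-0.5)) == -0.5  # encodes to Q - 5000, decodes back
```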
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FFY64FnjPEFZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "BASE=10\n", + "PRECISION=4\n", + "Q=23740629843760239486723" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "8LDGxRTLPEFi", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def encode(x_dec):\n", + " return int((x_dec * (BASE ** PRECISION)) % Q)\n", + "\n", + "def decode(x_fp):\n", + " return (x_fp if x_fp <= Q/2 else x_fp - Q) / BASE**PRECISION" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VQJmwTHePEFl", + "colab_type": "code", + "outputId": "a19e1aaa-b5ed-481d-a2cb-fd167d55f711", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "encode(0.5)" + ], + "execution_count": 33, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "5000" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 33 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "_c33CUqHPEFo", + "colab_type": "code", + "outputId": "b6dc639a-13b0-4a1c-88c3-0c55a26510ec", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "decode(23740629843760239486723)" + ], + "execution_count": 34, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "0.0" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 34 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "74yhFJ7GPEFu", + "colab_type": "code", + "outputId": "33c04150-dc9e-4830-ca49-7ce06db77937", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x = encrypt(encode(5.5))\n", + "y = encrypt(encode(2.3))\n", + "z = add(x,y)\n", + "decode(decrypt(z))" + ], + "execution_count": 35, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "7.8" + ] + 
}, + "metadata": { + "tags": [] + }, + "execution_count": 35 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "axHtQHqwPEF1", + "colab_type": "text" + }, + "source": [ + "# Lesson: Secret Sharing + Fixed Precision in PySyft\n", + "\n", + "While writing things from scratch is certainly educational, PySyft makes a great deal of this much easier for us through its abstractions." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "WoaZpAFBPEF2", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = bob.clear_objects()\n", + "alice = alice.clear_objects()\n", + "secure_worker = secure_worker.clear_objects()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "-fnYnJbYPEF4", + "colab_type": "code", + "outputId": "44ec4b80-15a4-4509-93c9-642e90c54bd6", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x = th.tensor([1,2,3,4,5])\n", + "x" + ], + "execution_count": 37, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([1, 2, 3, 4, 5])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 37 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BJ2B0EQsPEF9", + "colab_type": "text" + }, + "source": [ + "### Secret Sharing Using PySyft\n", + "\n", + "We can share using the simple .share() method!" 
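Under the hood, sharing a tensor amounts to splitting every element into random integer shares that sum back to the original modulo a large field size. A from-scratch sketch with plain Python lists rather than PySyft (the field size `Q` and the helper names here are our own illustrative choices):

```python
import random

Q = 2 ** 62  # an arbitrary large field size, for illustration only

def share_tensor(values, n_shares=3):
    # give each of n_shares workers one share of every element
    workers = [[] for _ in range(n_shares)]
    for v in values:
        shares = [random.randrange(Q) for _ in range(n_shares - 1)]
        shares.append((v - sum(shares)) % Q)  # shares sum to v mod Q
        for worker, s in zip(workers, shares):
            worker.append(s)
    return workers

def reconstruct(workers):
    # summing each element's shares mod Q recovers the original tensor
    return [sum(col) % Q for col in zip(*workers)]

assert reconstruct(share_tensor([1, 2, 3, 4, 5])) == [1, 2, 3, 4, 5]
```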
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "7Y6bW1MZPEF_", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.share(bob, alice, secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "uey8zbikPEGE", + "colab_type": "code", + "outputId": "bd7b5fab-5627-494d-920d-619351094e5b", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "bob._objects" + ], + "execution_count": 39, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{37759585635: tensor([1307486639978963936, 1772543521463849406, 755332784552035436,\n", + " 357352296837860548, 4558057761578889798])}" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 39 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-vYyCxO-PEGJ", + "colab_type": "text" + }, + "source": [ + "and as you can see, Bob now has one of the shares of x! Furthermore, we can still call addition in this state, and PySyft will automatically perform the remote execution for us!" 
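The "remote execution" behind an operation like `y = x + x` is nothing exotic: each worker simply adds the two shares it holds, modulo the field size, and ends up holding a share of the result. A plain-integer sketch (the field size `Q` is an arbitrary assumption of ours):

```python
Q = 2 ** 62

# three workers' shares of the secret x = 7
x_shares = [123456789, 987654321, (7 - 123456789 - 987654321) % Q]
assert sum(x_shares) % Q == 7

# each worker computes y = x + x locally, touching only its own share
y_shares = [(s + s) % Q for s in x_shares]

# fetching the result corresponds to collecting and summing the shares
assert sum(y_shares) % Q == 14
```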
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "LXjZ_7-aPEGK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "xB1ly4RCPEGT", + "colab_type": "code", + "outputId": "5510c37f-a0a8-4470-8431-d602d0780d06", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y" + ], + "execution_count": 41, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>[AdditiveSharingTensor]\n", + "\t-> [PointerTensor | me:69153440530 -> bob:98296074170]\n", + "\t-> [PointerTensor | me:73409921479 -> alice:27482987306]\n", + "\t-> [PointerTensor | me:66221913917 -> secure_worker:37769418130]\n", + "\t*crypto provider: me*" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 41 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VU3326WmPEGX", + "colab_type": "code", + "outputId": "5f83c558-d51e-42f2-97cc-0509bf706808", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y.get()" + ], + "execution_count": 42, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([ 2, 4, 6, 8, 10])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 42 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tJXCqv8rPEGb", + "colab_type": "text" + }, + "source": [ + "### Fixed Precision using PySyft\n", + "\n", + "We can also convert a tensor to fixed precision using .fix_precision()" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "3WCUPYELPEGe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([0.1,0.2,0.3])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9RE00HFfPEGh", + "colab_type": "code", + "outputId": "9bc71ee9-fd1e-4126-b447-0406875ba5f7", + "colab": { + 
"base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x" + ], + "execution_count": 44, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0.1000, 0.2000, 0.3000])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 44 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "iQxfMwGtPEGl", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.fix_prec()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "pCnVrABpPEGp", + "colab_type": "code", + "outputId": "5086eadf-9aac-4a8b-c29a-2622e5b18992", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x.child.child" + ], + "execution_count": 46, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([100, 200, 300])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 46 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "xeTMquXmPEGz", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "4QJzQ434PEG1", + "colab_type": "code", + "outputId": "80c36fe9-86b6-47a3-cac1-f4354b373ca6", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y = y.float_prec()\n", + "y" + ], + "execution_count": 48, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0.2000, 0.4000, 0.6000])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 48 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9Py1ksttPEG6", + "colab_type": "text" + }, + "source": [ + "### Shared Fixed Precision\n", + "\n", + "And of course, we can combine the two!" 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "frrUlGrFPEG6", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([0.1, 0.2, 0.3])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QCbtkqBgPEG9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.fix_prec().share(bob, alice, secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "OtiLRUJLPEHH", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nrMqRW6pPEHK", + "colab_type": "code", + "outputId": "a27f666a-b76c-424b-a367-59ff6b24c1f3", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y.get().float_prec()" + ], + "execution_count": 52, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0.2000, 0.4000, 0.6000])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 52 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fuCIcwQIPEHV", + "colab_type": "text" + }, + "source": [ + "One important caveat: whoever finally decrypts the aggregated result (e.g. by calling .get()) sees the model averages in the clear. Secret sharing hides the individual contributions, not the decrypted output."
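To see concretely what is and isn't revealed, here is a sketch of encrypted aggregation in the style of the earlier `encrypt`/`decrypt`/`add` project functions (re-defined here so the example is self-contained): only the decrypted total ever appears in the clear, never the individual inputs.

```python
import random

Q = 23740629843760239486723

def encrypt(x, n_shares=3):
    shares = [random.randrange(Q) for _ in range(n_shares - 1)]
    shares.append((x - sum(shares)) % Q)  # shares sum to x mod Q
    return tuple(shares)

def decrypt(shares):
    return sum(shares) % Q

def add(a, b):
    return tuple((ai + bi) % Q for ai, bi in zip(a, b))

# three parties' private values; only their sum is ever decrypted
private_values = [4, 10, 1]

total = encrypt(0)
for value in private_values:
    total = add(total, encrypt(value))

assert decrypt(total) == 15  # the aggregate is in the clear; the inputs are not
```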
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "tSi_WoAtPEHW", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x= th.tensor([0.1,0.2,0.3]).fix_prec().share(bob, alice, secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "8vHx8HmQjc4U", + "colab_type": "code", + "outputId": "2215d3c1-2714-45ae-b2a8-2c464f4a897c", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "x" + ], + "execution_count": 54, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>FixedPrecisionTensor>[AdditiveSharingTensor]\n", + "\t-> [PointerTensor | me:65152331605 -> bob:14688615297]\n", + "\t-> [PointerTensor | me:21721573050 -> alice:51531065930]\n", + "\t-> [PointerTensor | me:20883209566 -> secure_worker:22125518730]\n", + "\t*crypto provider: me*" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 54 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XlkXNRLBjcXL", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = x + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "b8dJmkHujntX", + "colab_type": "code", + "outputId": "35b44235-a384-4dfd-acfb-19f55aa1e024", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y" + ], + "execution_count": 56, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(Wrapper)>FixedPrecisionTensor>[AdditiveSharingTensor]\n", + "\t-> [PointerTensor | me:25479128076 -> bob:50158886607]\n", + "\t-> [PointerTensor | me:53730434968 -> alice:12483556925]\n", + "\t-> [PointerTensor | me:9077176005 -> secure_worker:22669538592]\n", + "\t*crypto provider: me*" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 56 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "YoY31a2Tjrkp", + "colab_type": "code", + 
"colab": {} + }, + "source": [ + "y = y.get().float_prec()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Ilo7T8Dyjq-4", + "colab_type": "code", + "outputId": "26bee9a2-f024-4afe-c10e-a42e0e9aa16e", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 0 + } + }, + "source": [ + "y" + ], + "execution_count": 58, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "tensor([0.2000, 0.4000, 0.6000])" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 58 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nWzk0s78PEHh", + "colab_type": "text" + }, + "source": [ + "# Final Project: Federated Learning with Encrypted Gradient Aggregation" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "uTNoXuy0zRCo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import torchvision.datasets as datasets\n" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rml-bt0OPEHn", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 200 + }, + "outputId": "2a624178-7c22-4e54-a00f-e008fe3a3af0" + }, + "source": [ + "import syft as sy\n", + "hook = sy.TorchHook(torch)\n" + ], + "execution_count": 60, + "outputs": [ + { + "output_type": "error", + "ename": "NameError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0msyft\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0msy\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mhook\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0msy\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTorchHook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtorch\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mNameError\u001b[0m: name 'torch' is not defined" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VEeN-kaW84wt", + "colab_type": "code", + "colab": {} + }, + "source": [ + "mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=None)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "H4x68TXSkrV_", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\") \n", + "alice = sy.VirtualWorker(hook, id=\"alice\")\n", + "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "o9ynPPfdkrSZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob.clear_objects\n", + "alice.clear_objects\n", + "secure_worker.clear_objects\n", + "compute_nodes = [bob, alice]" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "0homh5GAkrQZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "compute_nodes" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LkqJ8MfllNKa", + "colab_type": "code", + "colab": {} + }, + "source": [ + "class Arguments():\n", + " def __init__(self):\n", + " self.batch_size = 64\n", + " self.test_batch_size = 10000\n", + " self.epochs = 50\n", + " self.lr = 0.01\n", + " self.momentum = 0.5\n", + " self.no_cuda = False\n", + " self.seed = 1\n", + " self.log_interval = 30\n", + " self.save_model = False\n", + "\n", + "args = Arguments()\n", + "\n", + "torch.manual_seed(args.seed)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": 
"CJtYF0gHlVDK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from torchvision import transforms\n", + "transform=transforms.Compose([transforms.ToTensor(),transforms.Normalize((0.1307,), (0.3081,))]) \n", + "mnist_trainset = datasets.MNIST(root='../data', train=True, download=True, transform=transform)\n", + "train_loader = torch.utils.data.DataLoader(mnist_trainset, batch_size=args.test_batch_size, shuffle=True)\n", + "\n", + "\n", + "mnist_testset = datasets.MNIST(root='../data', train=False, download=True, transform=transform)\n", + "test_loader = torch.utils.data.DataLoader(mnist_testset, batch_size=args.test_batch_size, shuffle=True)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "EokvFurPlWjN", + "colab_type": "code", + "colab": {} + }, + "source": [ + "train_distributed_dataset = []\n", + "\n", + "for batch_idx, (data,target) in enumerate(train_loader):\n", + "    data = data.send(compute_nodes[batch_idx % len(compute_nodes)])\n", + "    target = target.send(compute_nodes[batch_idx % len(compute_nodes)])\n", + "    train_distributed_dataset.append((data, target))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "3_rPzJmqkrM6", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from torch import nn, optim\n", + "import torch.nn.functional as F" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "T13G3nrokrJo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "class Classifier(nn.Module):\n", + "    \"\"\"\n", + "    A simple feed-forward convolutional neural network for MNIST.\n", + "    \n", + "    \"\"\"\n", + "    def __init__(self):\n", + "        super().__init__()\n", + "        self.conv1 = nn.Conv2d(1, 20, 5, 1)\n", + "        self.conv2 = nn.Conv2d(20, 50, 5, 1)\n", + "        self.fc1 = nn.Linear(4*4*50, 500)\n", + "        self.fc2 = nn.Linear(500, 10)\n", + "\n", + "    def forward(self, x):\n", + "        x = F.relu(self.conv1(x))\n", + "        x = F.max_pool2d(x, 2, 2)\n", + "        x = 
F.relu(self.conv2(x))\n", + "        x = F.max_pool2d(x, 2, 2)\n", + "        x = x.view(-1, 4*4*50)\n", + "        x = F.relu(self.fc1(x))\n", + "        x = self.fc2(x)\n", + "        return F.log_softmax(x, dim=1)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "n-Ag0prokrGo", + "colab_type": "code", + "colab": {} + }, + "source": [ + "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", + "model = Classifier()\n", + "model = model.to(device)\n", + "optimizer = optim.SGD(model.parameters(), lr=args.lr)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "_ZvOLMVpkrDj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "for epoch in range(1, args.epochs + 1):\n", + "    model.train()\n", + "    for batch_idx, (data, target) in enumerate(train_distributed_dataset): # iterate through each worker's dataset\n", + "        \n", + "        model.send(data.location) #send the model to the right location\n", + "        \n", + "        data, target = data.to(device), target.to(device)\n", + "        \n", + "        optimizer.zero_grad() # 1) erase previous gradients (if they exist)\n", + "        output = model(data) # 2) make a prediction\n", + "        loss = F.nll_loss(output, target) # 3) calculate how much we missed\n", + "        loss.backward() # 4) figure out which weights caused us to miss\n", + "        optimizer.step() # 5) change those weights\n", + "        model.get() # get the model back (with gradients)\n", + "        \n", + "        if batch_idx % args.log_interval == 0:\n", + "            loss = loss.get() #get the loss back\n", + "            print('Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n", + "                epoch, batch_idx * data.shape[0], len(train_loader),\n", + "                100. 
* batch_idx / len(train_loader), loss.item()))" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file From 2db9d09c31fb786b061a4157a0bf21c858c16d92 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Sat, 24 Aug 2019 19:46:01 +0530 Subject: [PATCH 09/14] Delete Section 4 - Encrypted Deep Learning.ipynb --- Section 4 - Encrypted Deep Learning.ipynb | 1680 --------------------- 1 file changed, 1680 deletions(-) delete mode 100644 Section 4 - Encrypted Deep Learning.ipynb diff --git a/Section 4 - Encrypted Deep Learning.ipynb b/Section 4 - Encrypted Deep Learning.ipynb deleted file mode 100644 index a396bc6..0000000 --- a/Section 4 - Encrypted Deep Learning.ipynb +++ /dev/null @@ -1,1680 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section: Encrypted Deep Learning\n", - "\n", - "- Lesson: Reviewing Additive Secret Sharing\n", - "- Lesson: Encrypted Subtraction and Public/Scalar Multiplication\n", - "- Lesson: Encrypted Computation in PySyft\n", - "- Project: Build an Encrypted Database\n", - "- Lesson: Encrypted Deep Learning in PyTorch\n", - "- Lesson: Encrypted Deep Learning in Keras\n", - "- Final Project" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Reviewing Additive Secret Sharing\n", - "\n", - "_For more great information about SMPC protocols like this one, visit https://mortendahl.github.io. 
With permission, Morten's work directly inspired this first teaching segment._" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import random\n", - "import numpy as np\n", - "\n", - "BASE = 10\n", - "\n", - "PRECISION_INTEGRAL = 8\n", - "PRECISION_FRACTIONAL = 8\n", - "Q = 293973345475167247070445277780365744413\n", - "\n", - "PRECISION = PRECISION_INTEGRAL + PRECISION_FRACTIONAL\n", - "\n", - "assert(Q > BASE**PRECISION)\n", - "\n", - "def encode(rational):\n", - " upscaled = int(rational * BASE**PRECISION_FRACTIONAL)\n", - " field_element = upscaled % Q\n", - " return field_element\n", - "\n", - "def decode(field_element):\n", - " upscaled = field_element if field_element <= Q/2 else field_element - Q\n", - " rational = upscaled / BASE**PRECISION_FRACTIONAL\n", - " return rational\n", - "\n", - "def encrypt(secret):\n", - " first = random.randrange(Q)\n", - " second = random.randrange(Q)\n", - " third = (secret - first - second) % Q\n", - " return [first, second, third]\n", - "\n", - "def decrypt(sharing):\n", - " return sum(sharing) % Q\n", - "\n", - "def add(a, b):\n", - " c = list()\n", - " for i in range(len(a)):\n", - " c.append((a[i] + b[i]) % Q)\n", - " return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[157261321043261917823065844237599434634,\n", - " 132977400494673236547132219369279939003,\n", - " 3734623937232092700247214174036370776]" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = encrypt(encode(5.5))\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[154234279162428625260264272011868616963,\n", - " 160039428260900034921952397194418692688,\n", - " 273672983527005833958673886354674179174]" - ] - }, - "execution_count": 3, - "metadata": {}, 
- "output_type": "execute_result" - } - ], - "source": [ - "y = encrypt(encode(2.3))\n", - "y" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(17522254730523296012884838469102307184,\n", - " 293016828755573271469084616563698631691,\n", - " 277407607464237926658921100528710549950)" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = add(x,y)\n", - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "7.79999999" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "decode(decrypt(z))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Subtraction and Public/Scalar Multiplication" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "field = 23740629843760239486723" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "x = 5\n", - "\n", - "bob_x_share = 2372385723 # random number\n", - "alices_x_share = field - bob_x_share + x" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "5" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "(bob_x_share + alices_x_share) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "field = 10\n", - "\n", - "x = 5\n", - "\n", - "bob_x_share = 8\n", - "alice_x_share = field - bob_x_share + x\n", - "\n", - "y = 1\n", - "\n", - "bob_y_share = 9\n", - "alice_y_share = field - bob_y_share + y" - ] 
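The two operations this lesson builds, subtraction and multiplication by a public scalar, are both purely local: each party transforms only its own share. A standalone sketch using the same scheme as above (constants repeated so the snippet runs on its own):

```python
import random

# Share-wise subtraction and public-scalar multiplication on top of the
# three-share additive scheme; neither operation requires communication.
BASE = 10
PRECISION_FRACTIONAL = 8
Q = 293973345475167247070445277780365744413

def encode(rational):
    return int(rational * BASE ** PRECISION_FRACTIONAL) % Q

def decode(field_element):
    upscaled = field_element if field_element <= Q / 2 else field_element - Q
    return upscaled / BASE ** PRECISION_FRACTIONAL

def encrypt(secret):
    first, second = random.randrange(Q), random.randrange(Q)
    return [first, second, (secret - first - second) % Q]

def decrypt(shares):
    return sum(shares) % Q

def sub(a, b):
    # each party subtracts locally; the differences still sum to x - y mod Q
    return [(ai - bi) % Q for ai, bi in zip(a, b)]

def imul(a, scalar):
    # each party scales its own share by the public scalar
    return [(ai * scalar) % Q for ai in a]

x = encrypt(encode(5.5))
y = encrypt(encode(2.3))
difference = decode(decrypt(sub(x, y)))  # ~3.2
tripled = decode(decrypt(imul(x, 3)))    # 16.5
```

Multiplying two *shared* values is deliberately absent here: share-by-share products do not sum to the product of the secrets, which is why PySyft brings in a crypto provider for that case later in the section.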
- }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_x_share + alice_x_share) - (bob_y_share + alice_y_share)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_x_share - bob_y_share) + (alice_x_share - alice_y_share)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "26" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share + alice_x_share + bob_y_share + alice_y_share" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "bob_z_share = (bob_x_share - bob_y_share)\n", - "alice_z_share = (alice_x_share - alice_y_share)" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "(bob_z_share + alice_z_share) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "def sub(a, b):\n", - " c = list()\n", - " for i in range(len(a)):\n", - " c.append((a[i] - b[i]) % Q)\n", - " return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "field = 10\n", - "\n", - "x = 5\n", - "\n", - "bob_x_share = 8\n", - "alice_x_share = field - bob_x_share + x\n", - "\n", - "y = 1\n", - "\n", - "bob_y_share = 9\n", - "alice_y_share = field - bob_y_share + y" 
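Subtraction only works because negative intermediate values wrap around the modulus, and the `decode` convention (field elements above Q/2 are read as negative) recovers them. A quick standalone round-trip check of that convention:

```python
# Negative values wrap around the modulus: encoding -1.5 yields a huge
# positive field element just below Q, which decode maps back to -1.5
# because it lies in the upper ("negative") half of the field.
BASE = 10
PRECISION_FRACTIONAL = 8
Q = 293973345475167247070445277780365744413

def encode(rational):
    return int(rational * BASE ** PRECISION_FRACTIONAL) % Q

def decode(field_element):
    upscaled = field_element if field_element <= Q / 2 else field_element - Q
    return upscaled / BASE ** PRECISION_FRACTIONAL

e = encode(-1.5)      # equals Q - 150000000
assert e > Q // 2     # lands in the "negative" half of the field
restored = decode(e)  # -1.5
```

This is also why Q must be much larger than the values ever computed: once a true value strays past Q/2 in magnitude, decode can no longer tell positive from negative.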
- ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "15" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share + alice_x_share" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "11" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_y_share + alice_y_share" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "3" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_y_share * 3) + (alice_y_share * 3)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [], - "source": [ - "def imul(a, scalar):\n", - " \n", - " # logic here which can multiply by a public scalar\n", - " \n", - " c = list()\n", - " \n", - " for i in range(len(a)):\n", - " c.append((a[i] * scalar) % Q)\n", - " \n", - " return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[102429626723126324886479814859023395389,\n", - " 119121291581510607047487749044084540899,\n", - " 72422427170530315136477713877807808125]" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = encrypt(encode(5.5))\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "z = imul(x, 3)" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "16.5" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - 
"decode(decrypt(z))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Computation in PySyft" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [], - "source": [ - "import syft as sy\n", - "import torch as th\n", - "hook = sy.TorchHook(th)\n", - "from torch import nn, optim" - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [], - "source": [ - "bob = sy.VirtualWorker(hook, id=\"bob\").add_worker(sy.local_worker)\n", - "alice = sy.VirtualWorker(hook, id=\"alice\").add_worker(sy.local_worker)\n", - "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\").add_worker(sy.local_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4])\n", - "y = th.tensor([2,-1,1,0])" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": {}, - "outputs": [], - "source": [ - "y = y.share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([3, 1, 4, 4])" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x + y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([-1, 3, 2, 4])" - ] - }, - "execution_count": 30, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x - y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([ 2, -2, 3, 0])" - ] - }, - "execution_count": 
31, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x * y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0, 1, 1, 1])" - ] - }, - "execution_count": 32, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x > y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 0, 0, 0])" - ] - }, - "execution_count": 33, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x < y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0, 0, 0, 0])" - ] - }, - "execution_count": 34, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x == y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [ - { - "ename": "RuntimeError", - "evalue": "log2_vml_cpu not implemented for 'Long'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mth\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mfix_prec\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 643\u001b[0m \u001b[0mprec_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"precision_fractional\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 644\u001b[0m 
\u001b[0mmax_precision\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_get_maximum_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 645\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_requires_large_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmax_precision\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbase\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprec_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 646\u001b[0m return (\n\u001b[1;32m 647\u001b[0m \u001b[0msyft\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mLargePrecisionTensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36m_requires_large_precision\u001b[0;34m(self, max_precision, base, precision_fractional)\u001b[0m\n\u001b[1;32m 666\u001b[0m \"\"\"\n\u001b[1;32m 667\u001b[0m \u001b[0mbase_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbase\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0mprecision_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 668\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0many\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m 
\u001b[0mbase_fractional\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0mmax_precision\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 669\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 670\u001b[0m def share(\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 661\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 662\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 663\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 664\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 665\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# means that there is a wrapper to remove\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 655\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 656\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 657\u001b[0;31m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 658\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 659\u001b[0m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mRuntimeError\u001b[0m: log2_vml_cpu not implemented for 'Long'" - ] - } - ], - "source": [ - "x = th.tensor([1,2,3,4])\n", - "y = th.tensor([2,-1,1,0])\n", - "\n", - "x = x.fix_precision().share(bob, alice, crypto_provider=secure_worker)\n", - "y = y.fix_precision().share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [ - { - "ename": "AttributeError", - "evalue": "'Tensor' object has no attribute 'child'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m 
\u001b[0mz\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mget\u001b[0;34m(self, inplace, *args, **kwargs)\u001b[0m\n\u001b[1;32m 555\u001b[0m \u001b[0;31m# return self\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 556\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 557\u001b[0;31m \u001b[0mtensor\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0;31m# Clean the wrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mAttributeError\u001b[0m: 'Tensor' object has no attribute 'child'" - ] - } - ], - "source": [ - "z = x + y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x - y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x * y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x > y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z 
= x < y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x == y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Build an Encrypted Database" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Deep Learning in PyTorch" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Train a Model" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tensor(0.9441)\n", - "tensor(1.0317)\n", - "tensor(1.5362)\n", - "tensor(1.5309)\n", - "tensor(1.6070)\n", - "tensor(0.4302)\n", - "tensor(0.1941)\n", - "tensor(0.0789)\n", - "tensor(0.0409)\n", - "tensor(0.0215)\n", - "tensor(0.0136)\n", - "tensor(0.0094)\n", - "tensor(0.0068)\n", - "tensor(0.0052)\n", - "tensor(0.0041)\n", - "tensor(0.0034)\n", - "tensor(0.0028)\n", - "tensor(0.0024)\n", - "tensor(0.0020)\n", - "tensor(0.0017)\n" - ] - } - ], - "source": [ - "from torch import nn\n", - "from torch import 
optim\n", - "import torch.nn.functional as F\n", - "\n", - "# A Toy Dataset\n", - "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)\n", - "target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)\n", - "\n", - "class Net(nn.Module):\n", - " def __init__(self):\n", - " super(Net, self).__init__()\n", - " self.fc1 = nn.Linear(2, 20)\n", - " self.fc2 = nn.Linear(20, 1)\n", - "\n", - " def forward(self, x):\n", - " x = self.fc1(x)\n", - " x = F.relu(x)\n", - " x = self.fc2(x)\n", - " return x\n", - "\n", - "# A Toy Model\n", - "model = Net()\n", - "\n", - "def train():\n", - " # Training Logic\n", - " opt = optim.SGD(params=model.parameters(),lr=0.1)\n", - " for iter in range(20):\n", - "\n", - " # 1) erase previous gradients (if they exist)\n", - " opt.zero_grad()\n", - "\n", - " # 2) make a prediction\n", - " pred = model(data)\n", - "\n", - " # 3) calculate how much we missed\n", - " loss = ((pred - target)**2).sum()\n", - "\n", - " # 4) figure out which weights caused us to miss\n", - " loss.backward()\n", - "\n", - " # 5) change those weights\n", - " opt.step()\n", - "\n", - " # 6) print our progress\n", - " print(loss.data)\n", - " \n", - "train()" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([[ 0.0140],\n", - " [-0.0166],\n", - " [ 0.9727],\n", - " [ 1.0159]], grad_fn=)" - ] - }, - "execution_count": 38, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "model(data)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Encrypt the Model and Data" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_model = model.fix_precision().share(alice, bob, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[Parameter containing:\n", - 
" Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:85708028434 -> alice:81267402168]\n", - " \t-> (Wrapper)>[PointerTensor | me:62441055856 -> bob:99786602276]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:3433679299 -> alice:86383455980]\n", - " \t-> (Wrapper)>[PointerTensor | me:95582447050 -> bob:21797858764]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:10008001814 -> alice:42945910810]\n", - " \t-> (Wrapper)>[PointerTensor | me:71956775381 -> bob:2075238434]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:94360348394 -> alice:9472537528]\n", - " \t-> (Wrapper)>[PointerTensor | me:44614305427 -> bob:98758467549]\n", - " \t*crypto provider: secure_worker*]" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list(encrypted_model.parameters())" - ] - }, - { - "cell_type": "code", - "execution_count": 41, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_data = data.fix_precision().share(alice, bob, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - "\t-> (Wrapper)>[PointerTensor | me:90895649049 -> alice:241475900]\n", - "\t-> (Wrapper)>[PointerTensor | me:45527513617 -> bob:69153393355]\n", - "\t*crypto provider: secure_worker*" - ] - }, - "execution_count": 42, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "encrypted_data" - 
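Conceptually, what `encrypted_model(encrypted_data)` computes can be imitated with the stdlib sharing scheme from the start of this section, in the restricted case where the weights are public integers. This is a deliberate simplification: PySyft additionally shares the weights themselves and relies on the crypto provider's multiplication triples so that share-by-share products work.

```python
import random

# Illustrative only: an "encrypted" linear prediction y = w . x + b where
# the input x and bias b are additively shared and the weights are PUBLIC
# integers. Real PySyft also shares the weights and uses crypto-provider
# triples to multiply share by share.
BASE = 10
PRECISION_FRACTIONAL = 8
Q = 293973345475167247070445277780365744413

def encode(r):
    return int(r * BASE ** PRECISION_FRACTIONAL) % Q

def decode(fe):
    upscaled = fe if fe <= Q / 2 else fe - Q
    return upscaled / BASE ** PRECISION_FRACTIONAL

def encrypt(secret):
    a, b = random.randrange(Q), random.randrange(Q)
    return [a, b, (secret - a - b) % Q]

def decrypt(shares):
    return sum(shares) % Q

def add(a, b):
    return [(ai + bi) % Q for ai, bi in zip(a, b)]

def imul(a, scalar):
    return [(ai * scalar) % Q for ai in a]

x = [encrypt(encode(5.5)), encrypt(encode(2.3))]  # shared input features
w = [3, -2]                                       # public integer weights
b = encrypt(encode(1.0))                          # shared bias

shares = b
for xi, wi in zip(x, w):
    shares = add(shares, imul(xi, wi))            # accumulate w_i * x_i

prediction = decode(decrypt(shares))  # ~12.9 == 3*5.5 - 2*2.3 + 1.0
```

At no point does any single party hold a share list that reveals the inputs or the result; only combining all three shares at the end (the `get()` / `float_precision()` step in PySyft) decrypts the prediction.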
] - }, - { - "cell_type": "code", - "execution_count": 43, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_prediction = encrypted_model(encrypted_data)" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([[ 0.0130],\n", - " [-0.0160],\n", - " [ 0.9700],\n", - " [ 1.0140]])" - ] - }, - "execution_count": 44, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "encrypted_prediction.get().float_precision()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Deep Learning in Keras\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 1: Public Training\n", - "\n", - "Welcome to this tutorial! In the following notebooks you will learn how to provide private predictions. By private predictions, we mean that the data is constantly encrypted throughout the entire process. At no point is the user sharing raw data, only encrypted (that is, secret shared) data. In order to provide these private predictions, Syft Keras uses a library called [TF Encrypted](https://github.com/tf-encrypted/tf-encrypted) under the hood. TF Encrypted combines cutting-edge cryptographic and machine learning techniques, but you don't have to worry about this and can focus on your machine learning application.\n", - "\n", - "You can start serving private predictions with only three steps:\n", - "- **Step 1**: train your model with normal Keras.\n", - "- **Step 2**: secure and serve your machine learning model (server).\n", - "- **Step 3**: query the secured model to receive private predictions (client). 
\n", - "\n", - "Alright, let's go through these three steps so you can deploy impactful machine learning services without sacrificing user privacy or model security.\n", - "\n", - "Huge shoutout to the Dropout Labs ([@dropoutlabs](https://twitter.com/dropoutlabs)) and TF Encrypted ([@tf_encrypted](https://twitter.com/tf_encrypted)) teams for their great work which makes this demo possible, especially: Jason Mancuso ([@jvmancuso](https://twitter.com/jvmancuso)), Yann Dupis ([@YannDupis](https://twitter.com/YannDupis)), and Morten Dahl ([@mortendahlcs](https://github.com/mortendahlcs)). \n", - "\n", - "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Train Your Model in Keras\n", - "\n", - "To use privacy-preserving machine learning techniques for your projects you should not have to learn a new machine learning framework. If you have basic [Keras](https://keras.io/) knowledge, you can start using these techniques with Syft Keras. If you have never used Keras before, you can learn a bit more about it through the [Keras documentation](https://keras.io). \n", - "\n", - "Before serving private predictions, the first step is to train your model with normal Keras. As an example, we will train a model to classify handwritten digits. To train this model we will use the canonical [MNIST dataset](http://yann.lecun.com/exdb/mnist/).\n", - "\n", - "We borrow [this example](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) from the reference Keras repository. To train your classification model, you just run the cell below." 
- ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "x_train shape: (60000, 28, 28, 1)\n", - "60000 train samples\n", - "10000 test samples\n", - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Colocations handled automatically by placer.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Colocations handled automatically by placer.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Train on 60000 samples, validate on 10000 samples\n", - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Use tf.cast instead.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Use tf.cast instead.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Epoch 1/2\n", - "60000/60000 [==============================] - 26s 440us/sample - loss: 0.7004 - acc: 0.7795 - val_loss: 0.3245 - val_acc: 
0.9011\n", - "Epoch 2/2\n", - "60000/60000 [==============================] - 22s 361us/sample - loss: 0.2265 - acc: 0.9311 - val_loss: 0.1698 - val_acc: 0.9487\n", - "Test loss: 0.1698406898036599\n", - "Test accuracy: 0.9487\n" - ] - } - ], - "source": [ - "from __future__ import print_function\n", - "import tensorflow.keras as keras\n", - "from tensorflow.keras.datasets import mnist\n", - "from tensorflow.keras.models import Sequential\n", - "from tensorflow.keras.layers import Dense, Dropout, Flatten\n", - "from tensorflow.keras.layers import Conv2D, AveragePooling2D\n", - "from tensorflow.keras.layers import Activation\n", - "\n", - "batch_size = 128\n", - "num_classes = 10\n", - "epochs = 2\n", - "\n", - "# input image dimensions\n", - "img_rows, img_cols = 28, 28\n", - "\n", - "# the data, split between train and test sets\n", - "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", - "\n", - "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", - "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", - "input_shape = (img_rows, img_cols, 1)\n", - "\n", - "x_train = x_train.astype('float32')\n", - "x_test = x_test.astype('float32')\n", - "x_train /= 255\n", - "x_test /= 255\n", - "print('x_train shape:', x_train.shape)\n", - "print(x_train.shape[0], 'train samples')\n", - "print(x_test.shape[0], 'test samples')\n", - "\n", - "# convert class vectors to binary class matrices\n", - "y_train = keras.utils.to_categorical(y_train, num_classes)\n", - "y_test = keras.utils.to_categorical(y_test, num_classes)\n", - "\n", - "model = Sequential()\n", - "\n", - "model.add(Conv2D(10, (3, 3), input_shape=input_shape))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(32, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(64, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - 
"model.add(Flatten())\n", - "model.add(Dense(num_classes, activation='softmax'))\n", - "\n", - "model.compile(loss=keras.losses.categorical_crossentropy,\n", - " optimizer=keras.optimizers.Adadelta(),\n", - " metrics=['accuracy'])\n", - "\n", - "model.fit(x_train, y_train,\n", - " batch_size=batch_size,\n", - " epochs=epochs,\n", - " verbose=1,\n", - " validation_data=(x_test, y_test))\n", - "score = model.evaluate(x_test, y_test, verbose=0)\n", - "print('Test loss:', score[0])\n", - "print('Test accuracy:', score[1])" - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": {}, - "outputs": [], - "source": [ - "## Save your model's weights for future private prediction\n", - "model.save('short-conv-mnist.h5')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 2: Load and Serve the Model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that you have a trained model with normal Keras, you are ready to serve some private predictions. We can do that using Syft Keras.\n", - "\n", - "To secure and serve this model, we will need three TFEWorkers (servers). This is because TF Encrypted under the hood uses an encryption technique called [multi-party computation (MPC)](https://en.wikipedia.org/wiki/Secure_multi-party_computation). The idea is to split the model weights and input data into shares, then send a share of each value to the different servers. The key property is that if you look at the share on one server, it reveals nothing about the original value (input data or model weights).\n", - "\n", - "We'll define a Syft Keras model like we did in the previous notebook. However, there is a trick: before instantiating this model, we'll run `hook = sy.KerasHook(tf.keras)`. 
This will add three important new methods to the Keras Sequential class:\n", - " - `share`: will secure your model via secret sharing; by default, it will use the SecureNN protocol from TF Encrypted to secret share your model between each of the three TFEWorkers. Most importantly, this will add the capability of providing predictions on encrypted data.\n", - " - `serve`: this function will launch a serving queue, so that the TFEWorkers can accept prediction requests on the secured model from external clients.\n", - " - `shutdown_workers`: once you are done providing private predictions, you can shut down your model by running this function. It will direct you to shut down the server processes manually if you've opted to manually manage each worker.\n", - "\n", - "If you want to learn more about MPC, you can read this excellent [blog](https://mortendahl.github.io/2017/04/17/private-deep-learning-with-mpc/)." - ] - }, - { - "cell_type": "code", - "execution_count": 47, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import tensorflow as tf\n", - "from tensorflow.keras import Sequential\n", - "from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Activation, Flatten, ReLU\n", - "\n", - "import syft as sy\n", - "hook = sy.KerasHook(tf.keras)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Model\n", - "\n", - "As you can see, we define almost the exact same model as before, except we provide a `batch_input_shape`. This allows TF Encrypted to better optimize the secure computations via predefined tensor shapes. For this MNIST demo, we'll send input data with a shape of (1, 28, 28, 1). \n", - "We also return the logits instead of the softmax output, because softmax is expensive to compute under MPC and we don't need it to serve prediction requests."
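The share-splitting idea described above can be sketched in a few lines of plain Python. This is a toy illustration of additive secret sharing only, not the actual SecureNN/TF Encrypted protocol; `Q`, `share`, and `reconstruct` are names invented for the demo:

```python
import random

# Modulus for the share arithmetic; any sufficiently large value works for this toy demo.
Q = 2**61 - 1

def share(secret, n=3):
    """Split an integer into n additive shares modulo Q."""
    shares = [random.randrange(Q) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Only the sum of ALL shares (mod Q) reveals the secret."""
    return sum(shares) % Q

weight = 42                  # stand-in for a single model weight
s0, s1, s2 = share(weight)   # one share per TFEWorker
# Each share on its own is a uniformly random field element and reveals
# nothing about `weight`; all three together recover it exactly.
assert reconstruct([s0, s1, s2]) == weight
```

This is why looking at the data held by any single server tells you nothing about the model weights or the inputs.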
- ] - }, - { - "cell_type": "code", - "execution_count": 48, - "metadata": {}, - "outputs": [], - "source": [ - "num_classes = 10\n", - "input_shape = (1, 28, 28, 1)\n", - "\n", - "model = Sequential()\n", - "\n", - "model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(32, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(64, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Flatten())\n", - "model.add(Dense(num_classes, name=\"logit\"))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Load Pre-trained Weights\n", - "\n", - "With `load_weights` you can easily load the weights you saved after training your model." - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "metadata": {}, - "outputs": [], - "source": [ - "pre_trained_weights = 'short-conv-mnist.h5'\n", - "model.load_weights(pre_trained_weights)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 3: Set Up Your Worker Connectors\n", - "\n", - "Let's now connect to the TFEWorkers (`alice`, `bob`, and `carol`) required by TF Encrypted to perform private predictions. For each TFEWorker, you just have to specify a host. We then combine these workers into a cluster.\n", - "\n", - "These workers run a [TensorFlow server](https://www.tensorflow.org/api_docs/python/tf/distribute/Server), which you can either manage manually (`AUTO = False`) or ask the workers to manage for you (`AUTO = True`). If you choose to manage them manually, you will be instructed to execute a terminal command on each worker's host device after calling `model.share()` below. If all workers are hosted on a single device (e.g. `localhost`), you can have Syft automatically manage each worker's TensorFlow server.
- ] - }, - { - "cell_type": "code", - "execution_count": 50, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4000: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server0'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n", - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4001: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server1'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n", - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4002: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server2'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n" - ] - } - ], - "source": [ - "AUTO = False\n", - "\n", - "alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)\n", - "bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)\n", - "carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)\n", - "\n", - "cluster = sy.TFECluster(alice, bob, carol)\n", - "cluster.start()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 4: Launch 3 Servers\n", - "\n", - "If you have chosen to manually control the workers (i.e. `AUTO = False`) then you now need to launch 3 servers. Look for the exact commands to run in the info messages printed above. 
You are looking for the following, where the `...` are actual file paths:\n", - "\n", - "- `python -m tf_encrypted.player --config ... server0`\n", - "- `python -m tf_encrypted.player --config ... server1`\n", - "- `python -m tf_encrypted.player --config ... server2`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 5: Split the Model Into Shares\n", - "\n", - "Thanks to `sy.KerasHook(tf.keras)` you can call the `share` method to transform your model into a TF Encrypted Keras model.\n", - "\n", - "If you chose to manually manage the servers above, this step will not complete until they have all been launched. Note that your firewall may ask for Python to accept incoming connections." - ] - }, - { - "cell_type": "code", - "execution_count": 51, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:Starting session on target 'grpc://localhost:4000' using config graph_options {\n", - "}\n", - "\n" - ] - } - ], - "source": [ - "model.share(cluster)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 6: Serve the Model\n", - "\n", - "Perfect! Now by calling `model.serve`, your model is ready to provide some private predictions. You can set `num_requests` to limit the number of prediction requests served by the model; if it is not specified, the model will be served until interrupted."
- ] - }, - { - "cell_type": "code", - "execution_count": 52, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Served encrypted prediction 1 to client.\n", - "Served encrypted prediction 2 to client.\n", - "Served encrypted prediction 3 to client.\n" - ] - } - ], - "source": [ - "model.serve(num_requests=3)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 7: Run the Client\n", - "\n", - "At this point, open up and run the companion notebook: Section 4b - Encrypted Keras Client" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 8: Shut Down the Servers\n", - "\n", - "Once the request limit above has been reached, the model will no longer be available for serving requests, but it is still secret-shared between the three workers above. You can kill the workers by executing the cell below.\n", - "\n", - "**Congratulations** on finishing Part 12: Secure Classification with Syft Keras and TFE!" - ] - }, - { - "cell_type": "code", - "execution_count": 53, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4000'.\n", - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4001'.\n", - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4002'.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Process ID 15442 has been killed.\n", - "Process ID 15438 has been killed.\n", - "Process ID 15432 has been killed.\n" - ] - } - ], - "source": [ - "model.stop()\n", - "cluster.stop()\n", - "\n", - "if not AUTO:\n", - " process_ids = !ps aux | grep '[p]ython -m tf_encrypted.player --config' | awk '{print $2}'\n", - " for process_id in process_ids:\n", - " !kill {process_id}\n", - " print(\"Process ID {id} has been killed.\".format(id=process_id))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": 
{}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Keystone Project - Mix and Match What You've Learned\n", - "\n", - "Description: Take two of the concepts you've learned about in this course (Encrypted Computation, Federated Learning, Differential Privacy) and combine them for a use case of your own design. Extra credit if you can get your demo working with [WebSocketWorkers](https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials/advanced/websockets-example-MNIST) instead of VirtualWorkers! Then take your demo or example application, write a blogpost, and share that blogpost in #general-discussion on OpenMined's slack!!!\n", - "\n", - "Inspiration:\n", - "- This Course's Code: https://github.com/Udacity/private-ai\n", - "- OpenMined's Tutorials: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials\n", - "- OpenMined's Blog: https://blog.openmined.org" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - 
"metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From 8e464e2acdd1e038ccdea976b02fcdbe76e82599 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 26 Aug 2019 21:59:37 +0530 Subject: [PATCH 10/14] Add files via upload --- Section 4 - Encrypted Deep Learning.ipynb | 1680 +++++++++++++++++++++ 1 file changed, 1680 insertions(+) create mode 100644 Section 4 - Encrypted Deep Learning.ipynb diff --git a/Section 4 - Encrypted Deep Learning.ipynb b/Section 4 - Encrypted Deep Learning.ipynb new file mode 100644 index 0000000..a396bc6 --- /dev/null 
+++ b/Section 4 - Encrypted Deep Learning.ipynb @@ -0,0 +1,1680 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Section: Encrypted Deep Learning\n", + "\n", + "- Lesson: Reviewing Additive Secret Sharing\n", + "- Lesson: Encrypted Subtraction and Public/Scalar Multiplication\n", + "- Lesson: Encrypted Computation in PySyft\n", + "- Project: Build an Encrypted Database\n", + "- Lesson: Encrypted Deep Learning in PyTorch\n", + "- Lesson: Encrypted Deep Learning in Keras\n", + "- Final Project" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lesson: Reviewing Additive Secret Sharing\n", + "\n", + "_For more great information about SMPC protocols like this one, visit https://mortendahl.github.io. With permission, Morten's work directly inspired this first teaching segment._" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import random\n", + "import numpy as np\n", + "\n", + "BASE = 10\n", + "\n", + "PRECISION_INTEGRAL = 8\n", + "PRECISION_FRACTIONAL = 8\n", + "Q = 293973345475167247070445277780365744413\n", + "\n", + "PRECISION = PRECISION_INTEGRAL + PRECISION_FRACTIONAL\n", + "\n", + "assert(Q > BASE**PRECISION)\n", + "\n", + "def encode(rational):\n", + " upscaled = int(rational * BASE**PRECISION_FRACTIONAL)\n", + " field_element = upscaled % Q\n", + " return field_element\n", + "\n", + "def decode(field_element):\n", + " upscaled = field_element if field_element <= Q/2 else field_element - Q\n", + " rational = upscaled / BASE**PRECISION_FRACTIONAL\n", + " return rational\n", + "\n", + "def encrypt(secret):\n", + " first = random.randrange(Q)\n", + " second = random.randrange(Q)\n", + " third = (secret - first - second) % Q\n", + " return [first, second, third]\n", + "\n", + "def decrypt(sharing):\n", + " return sum(sharing) % Q\n", + "\n", + "def add(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] + 
b[i]) % Q)\n", + " return tuple(c)" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[157261321043261917823065844237599434634,\n", + " 132977400494673236547132219369279939003,\n", + " 3734623937232092700247214174036370776]" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x = encrypt(encode(5.5))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[154234279162428625260264272011868616963,\n", + " 160039428260900034921952397194418692688,\n", + " 273672983527005833958673886354674179174]" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "y = encrypt(encode(2.3))\n", + "y" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(17522254730523296012884838469102307184,\n", + " 293016828755573271469084616563698631691,\n", + " 277407607464237926658921100528710549950)" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = add(x,y)\n", + "z" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "7.79999999" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "decode(decrypt(z))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lesson: Encrypted Subtraction and Public/Scalar Multiplication" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "field = 23740629843760239486723" + ] + }, + { + "cell_type": "code", + "execution_count": 7, 
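Incidentally, the `7.79999999` above (rather than an exact `7.8`) is not caused by the secret sharing: it comes from the fixed-point `encode` step, where `int()` truncates the scaled float. A quick check, reusing the same `BASE` and `PRECISION_FRACTIONAL` constants (the modular wrap is omitted, which makes no difference for these small positive values):

```python
BASE = 10
PRECISION_FRACTIONAL = 8
SCALE = BASE**PRECISION_FRACTIONAL

# 2.3 has no exact binary representation: 2.3 * 10**8 evaluates to
# 229999999.99999997, and int() truncates it down to 229999999,
# losing one unit in the last place.
print(int(2.3 * SCALE))   # 229999999
print(int(5.5 * SCALE))   # 550000000 (5.5 IS exactly representable)

# Adding the encodings and rescaling reproduces the result seen above.
print((int(5.5 * SCALE) + int(2.3 * SCALE)) / SCALE)   # 7.79999999
```

Using `round()` instead of `int()` inside `encode` would avoid this particular off-by-one, at the cost of a slightly different rounding convention.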
+ "metadata": {}, + "outputs": [], + "source": [ + "x = 5\n", + "\n", + "bob_x_share = 2372385723 # random number\n", + "alices_x_share = field - bob_x_share + x" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(bob_x_share + alices_x_share) % field" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "field = 10\n", + "\n", + "x = 5\n", + "\n", + "bob_x_share = 8\n", + "alice_x_share = field - bob_x_share + x\n", + "\n", + "y = 1\n", + "\n", + "bob_y_share = 9\n", + "alice_y_share = field - bob_y_share + y" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "((bob_x_share + alice_x_share) - (bob_y_share + alice_y_share)) % field" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "((bob_x_share - bob_y_share) + (alice_x_share - alice_y_share)) % field" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "26" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bob_x_share + alice_x_share + bob_y_share + alice_y_share" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "bob_z_share = (bob_x_share - bob_y_share)\n", + "alice_z_share = (alice_x_share - alice_y_share)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": 
{}, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(bob_z_share + alice_z_share) % field" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "def sub(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] - b[i]) % Q)\n", + " return tuple(c)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "field = 10\n", + "\n", + "x = 5\n", + "\n", + "bob_x_share = 8\n", + "alice_x_share = field - bob_x_share + x\n", + "\n", + "y = 1\n", + "\n", + "bob_y_share = 9\n", + "alice_y_share = field - bob_y_share + y" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "15" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bob_x_share + alice_x_share" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "11" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bob_y_share + alice_y_share" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "((bob_y_share * 3) + (alice_y_share * 3)) % field" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [], + "source": [ + "def imul(a, scalar):\n", + " \n", + " # logic here which can multiply by a public scalar\n", + " \n", + " c = list()\n", + " \n", + " for i in range(len(a)):\n", + " c.append((a[i] * scalar) % Q)\n", + " \n", + " return tuple(c)" + ] + }, + { + "cell_type": 
"code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[102429626723126324886479814859023395389,\n", + " 119121291581510607047487749044084540899,\n", + " 72422427170530315136477713877807808125]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x = encrypt(encode(5.5))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "z = imul(x, 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "16.5" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "decode(decrypt(z))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lesson: Encrypted Computation in PySyft" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "import syft as sy\n", + "import torch as th\n", + "hook = sy.TorchHook(th)\n", + "from torch import nn, optim" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\").add_worker(sy.local_worker)\n", + "alice = sy.VirtualWorker(hook, id=\"alice\").add_worker(sy.local_worker)\n", + "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\").add_worker(sy.local_worker)" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [], + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "x = x.share(bob, alice, crypto_provider=secure_worker)" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "y = y.share(bob, alice, crypto_provider=secure_worker)" 
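One detail worth noting about `imul` above: it only works because the scalar is *public*, so each party can multiply its own share locally with no communication, since `(s1 + s2 + s3) * k` equals `s1*k + s2*k + s3*k` modulo `Q`. The same trick also covers negative scalars, thanks to the `Q/2` threshold in `decode`. A self-contained sketch under those assumptions (the helpers are re-declared with the same constants as this section so the snippet runs on its own):

```python
import random

Q = 293973345475167247070445277780365744413
BASE, PRECISION_FRACTIONAL = 10, 8
SCALE = BASE**PRECISION_FRACTIONAL

def encode(rational):
    return int(rational * SCALE) % Q

def decode(field_element):
    # Field elements above Q/2 represent negative numbers (wrap-around).
    upscaled = field_element if field_element <= Q // 2 else field_element - Q
    return upscaled / SCALE

def encrypt(secret):
    first, second = random.randrange(Q), random.randrange(Q)
    return [first, second, (secret - first - second) % Q]

def decrypt(shares):
    return sum(shares) % Q

def imul(shares, scalar):
    # Purely local: public-scalar multiplication distributes over the
    # sum of the shares modulo Q.
    return [(s * scalar) % Q for s in shares]

x = encrypt(encode(5.5))
assert decode(decrypt(imul(x, 3))) == 16.5
assert decode(decrypt(imul(x, -2))) == -11.0
```

Multiplying two *encrypted* values is a different matter: the product of two encodings carries a doubled fixed-point scale and the cross-terms between shares involve both parties, which is why it needs a more involved protocol (and a crypto provider in PySyft, as the next lesson shows).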
+ ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([3, 1, 4, 4])" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x + y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([-1, 3, 2, 4])" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x - y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([ 2, -2, 3, 0])" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x * y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([0, 1, 1, 1])" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x > y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([1, 0, 0, 0])" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x < y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([0, 0, 0, 0])" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "z = x == y\n", + "z.get()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "ename": "RuntimeError", + "evalue": 
"log2_vml_cpu not implemented for 'Long'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mth\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mfix_prec\u001b[0;34m(self, 
*args, **kwargs)\u001b[0m\n\u001b[1;32m 643\u001b[0m \u001b[0mprec_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"precision_fractional\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 644\u001b[0m \u001b[0mmax_precision\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_get_maximum_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 645\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_requires_large_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmax_precision\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbase\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprec_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 646\u001b[0m return (\n\u001b[1;32m 647\u001b[0m \u001b[0msyft\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mLargePrecisionTensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36m_requires_large_precision\u001b[0;34m(self, max_precision, base, precision_fractional)\u001b[0m\n\u001b[1;32m 666\u001b[0m \"\"\"\n\u001b[1;32m 667\u001b[0m \u001b[0mbase_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbase\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0mprecision_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 668\u001b[0;31m \u001b[0;32mreturn\u001b[0m 
\u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0many\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mbase_fractional\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0mmax_precision\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 669\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 670\u001b[0m def share(\n", + "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 661\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 662\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 663\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 664\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 665\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# means that there is a wrapper to remove\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, 
**kwargs)\u001b[0m\n\u001b[1;32m 655\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 656\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 657\u001b[0;31m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 658\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 659\u001b[0m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mRuntimeError\u001b[0m: log2_vml_cpu not implemented for 'Long'" + ] + } + ], + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])\n", + "\n", + "x = x.fix_precision().share(bob, alice, crypto_provider=secure_worker)\n", + "y = y.fix_precision().share(bob, alice, crypto_provider=secure_worker)" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [ + { + "ename": "AttributeError", + "evalue": "'Tensor' object has no attribute 'child'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mz\u001b[0m 
\u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mz\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mget\u001b[0;34m(self, inplace, *args, **kwargs)\u001b[0m\n\u001b[1;32m 555\u001b[0m \u001b[0;31m# return self\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 556\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 557\u001b[0;31m \u001b[0mtensor\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0;31m# Clean the wrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mAttributeError\u001b[0m: 'Tensor' object has no attribute 'child'" + ] + } + ], + "source": [ + "z = x + y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = x - y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = x * y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + 
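The cells above add, subtract, and multiply tensors that were encoded with `fix_precision()` and split with `share()`. The core idea behind those two calls is simple enough to sketch in pure Python; the field size, base, and precision below are illustrative values chosen for this sketch, not PySyft's actual internals:

```python
import random

Q = 2 ** 62               # illustrative field modulus (not PySyft's real one)
BASE, PRECISION = 10, 3   # mirrors fix_precision(base=10, precision_fractional=3)

def encode(x):
    """Float -> fixed-point integer in the field (like fix_precision)."""
    return int(round(x * BASE ** PRECISION)) % Q

def decode(n):
    """Field element -> float, unwrapping negatives (like float_precision)."""
    if n > Q // 2:
        n -= Q
    return n / BASE ** PRECISION

def share(n, num_shares=2):
    """Split a field element into additive shares that sum to n mod Q."""
    shares = [random.randrange(Q) for _ in range(num_shares - 1)]
    shares.append((n - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    return sum(shares) % Q

# "Encrypt" two values, then add them share-wise: no single share
# reveals anything about the underlying plaintext.
a = share(encode(2.5))
b = share(encode(-1.25))
c = [(sa + sb) % Q for sa, sb in zip(a, b)]  # addition works per-share

print(decode(reconstruct(c)))  # 1.25
```

Addition and subtraction work locally on each share like this; multiplication and comparison need extra protocol machinery (e.g. triples from the crypto provider), which is why those operations are more expensive under MPC.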
"source": [ + "z = x > y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = x < y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = x == y\n", + "z.get().float_precision()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Project: Build an Encrypted Database" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# try this project here!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lesson: Encrypted Deep Learning in PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Train a Model" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "tensor(0.9441)\n", + "tensor(1.0317)\n", + "tensor(1.5362)\n", + "tensor(1.5309)\n", + "tensor(1.6070)\n", + "tensor(0.4302)\n", + "tensor(0.1941)\n", + "tensor(0.0789)\n", + "tensor(0.0409)\n", + "tensor(0.0215)\n", + "tensor(0.0136)\n", + "tensor(0.0094)\n", + "tensor(0.0068)\n", + "tensor(0.0052)\n", + "tensor(0.0041)\n", + 
"tensor(0.0034)\n", + "tensor(0.0028)\n", + "tensor(0.0024)\n", + "tensor(0.0020)\n", + "tensor(0.0017)\n" + ] + } + ], + "source": [ + "from torch import nn\n", + "from torch import optim\n", + "import torch.nn.functional as F\n", + "\n", + "# A Toy Dataset\n", + "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)\n", + "target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)\n", + "\n", + "class Net(nn.Module):\n", + " def __init__(self):\n", + " super(Net, self).__init__()\n", + " self.fc1 = nn.Linear(2, 20)\n", + " self.fc2 = nn.Linear(20, 1)\n", + "\n", + " def forward(self, x):\n", + " x = self.fc1(x)\n", + " x = F.relu(x)\n", + " x = self.fc2(x)\n", + " return x\n", + "\n", + "# A Toy Model\n", + "model = Net()\n", + "\n", + "def train():\n", + " # Training Logic\n", + " opt = optim.SGD(params=model.parameters(),lr=0.1)\n", + " for iter in range(20):\n", + "\n", + " # 1) erase previous gradients (if they exist)\n", + " opt.zero_grad()\n", + "\n", + " # 2) make a prediction\n", + " pred = model(data)\n", + "\n", + " # 3) calculate how much we missed\n", + " loss = ((pred - target)**2).sum()\n", + "\n", + " # 4) figure out which weights caused us to miss\n", + " loss.backward()\n", + "\n", + " # 5) change those weights\n", + " opt.step()\n", + "\n", + " # 6) print our progress\n", + " print(loss.data)\n", + " \n", + "train()" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([[ 0.0140],\n", + " [-0.0166],\n", + " [ 0.9727],\n", + " [ 1.0159]], grad_fn=)" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "model(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Encrypt the Model and Data" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [], + "source": [ + "encrypted_model = model.fix_precision().share(alice, bob, 
crypto_provider=secure_worker)" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Parameter containing:\n", + " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", + " \t-> (Wrapper)>[PointerTensor | me:85708028434 -> alice:81267402168]\n", + " \t-> (Wrapper)>[PointerTensor | me:62441055856 -> bob:99786602276]\n", + " \t*crypto provider: secure_worker*, Parameter containing:\n", + " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", + " \t-> (Wrapper)>[PointerTensor | me:3433679299 -> alice:86383455980]\n", + " \t-> (Wrapper)>[PointerTensor | me:95582447050 -> bob:21797858764]\n", + " \t*crypto provider: secure_worker*, Parameter containing:\n", + " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", + " \t-> (Wrapper)>[PointerTensor | me:10008001814 -> alice:42945910810]\n", + " \t-> (Wrapper)>[PointerTensor | me:71956775381 -> bob:2075238434]\n", + " \t*crypto provider: secure_worker*, Parameter containing:\n", + " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", + " \t-> (Wrapper)>[PointerTensor | me:94360348394 -> alice:9472537528]\n", + " \t-> (Wrapper)>[PointerTensor | me:44614305427 -> bob:98758467549]\n", + " \t*crypto provider: secure_worker*]" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(encrypted_model.parameters())" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [], + "source": [ + "encrypted_data = data.fix_precision().share(alice, bob, crypto_provider=secure_worker)" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", + "\t-> (Wrapper)>[PointerTensor | me:90895649049 -> alice:241475900]\n", + "\t-> (Wrapper)>[PointerTensor | me:45527513617 
-> bob:69153393355]\n", + "\t*crypto provider: secure_worker*" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "encrypted_data" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": {}, + "outputs": [], + "source": [ + "encrypted_prediction = encrypted_model(encrypted_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([[ 0.0130],\n", + " [-0.0160],\n", + " [ 0.9700],\n", + " [ 1.0140]])" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "encrypted_prediction.get().float_precision()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lesson: Encrypted Deep Learning in Keras\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1: Public Training\n", + "\n", + "Welcome to this tutorial! In the following notebooks you will learn how to provide private predictions. By private predictions, we mean that the data is constantly encrypted throughout the entire process. At no point is the user sharing raw data, only encrypted (that is, secret shared) data. In order to provide these private predictions, Syft Keras uses a library called [TF Encrypted](https://github.com/tf-encrypted/tf-encrypted) under the hood. TF Encrypted combines cutting-edge cryptographic and machine learning techniques, but you don't have to worry about this and can focus on your machine learning application.\n", + "\n", + "You can start serving private predictions with only three steps:\n", + "- **Step 1**: train your model with normal Keras.\n", + "- **Step 2**: secure and serve your machine learning model (server).\n", + "- **Step 3**: query the secured model to receive private predictions (client). 
\n", + "\n", + "Alright, let's go through these three steps so you can deploy impactful machine learning services without sacrificing user privacy or model security.\n", + "\n", + "Huge shoutout to the Dropout Labs ([@dropoutlabs](https://twitter.com/dropoutlabs)) and TF Encrypted ([@tf_encrypted](https://twitter.com/tf_encrypted)) teams for their great work which makes this demo possible, especially: Jason Mancuso ([@jvmancuso](https://twitter.com/jvmancuso)), Yann Dupis ([@YannDupis](https://twitter.com/YannDupis)), and Morten Dahl ([@mortendahlcs](https://github.com/mortendahlcs)). \n", + "\n", + "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Train Your Model in Keras\n", + "\n", + "To use privacy-preserving machine learning techniques for your projects you should not have to learn a new machine learning framework. If you have basic [Keras](https://keras.io/) knowledge, you can start using these techniques with Syft Keras. If you have never used Keras before, you can learn a bit more about it through the [Keras documentation](https://keras.io). \n", + "\n", + "Before serving private predictions, the first step is to train your model with normal Keras. As an example, we will train a model to classify handwritten digits. To train this model we will use the canonical [MNIST dataset](http://yann.lecun.com/exdb/mnist/).\n", + "\n", + "We borrow [this example](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) from the reference Keras repository. To train your classification model, you just run the cell below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "x_train shape: (60000, 28, 28, 1)\n", + "60000 train samples\n", + "10000 test samples\n", + "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "Colocations handled automatically by placer.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "Colocations handled automatically by placer.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train on 60000 samples, validate on 10000 samples\n", + "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "Use tf.cast instead.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "Use tf.cast instead.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Epoch 1/2\n", + "60000/60000 [==============================] - 26s 440us/sample - loss: 0.7004 - acc: 0.7795 - val_loss: 0.3245 - val_acc: 
0.9011\n", + "Epoch 2/2\n", + "60000/60000 [==============================] - 22s 361us/sample - loss: 0.2265 - acc: 0.9311 - val_loss: 0.1698 - val_acc: 0.9487\n", + "Test loss: 0.1698406898036599\n", + "Test accuracy: 0.9487\n" + ] + } + ], + "source": [ + "from __future__ import print_function\n", + "import tensorflow.keras as keras\n", + "from tensorflow.keras.datasets import mnist\n", + "from tensorflow.keras.models import Sequential\n", + "from tensorflow.keras.layers import Dense, Dropout, Flatten\n", + "from tensorflow.keras.layers import Conv2D, AveragePooling2D\n", + "from tensorflow.keras.layers import Activation\n", + "\n", + "batch_size = 128\n", + "num_classes = 10\n", + "epochs = 2\n", + "\n", + "# input image dimensions\n", + "img_rows, img_cols = 28, 28\n", + "\n", + "# the data, split between train and test sets\n", + "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", + "\n", + "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", + "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", + "input_shape = (img_rows, img_cols, 1)\n", + "\n", + "x_train = x_train.astype('float32')\n", + "x_test = x_test.astype('float32')\n", + "x_train /= 255\n", + "x_test /= 255\n", + "print('x_train shape:', x_train.shape)\n", + "print(x_train.shape[0], 'train samples')\n", + "print(x_test.shape[0], 'test samples')\n", + "\n", + "# convert class vectors to binary class matrices\n", + "y_train = keras.utils.to_categorical(y_train, num_classes)\n", + "y_test = keras.utils.to_categorical(y_test, num_classes)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + 
"model.add(Flatten())\n", + "model.add(Dense(num_classes, activation='softmax'))\n", + "\n", + "model.compile(loss=keras.losses.categorical_crossentropy,\n", + " optimizer=keras.optimizers.Adadelta(),\n", + " metrics=['accuracy'])\n", + "\n", + "model.fit(x_train, y_train,\n", + " batch_size=batch_size,\n", + " epochs=epochs,\n", + " verbose=1,\n", + " validation_data=(x_test, y_test))\n", + "score = model.evaluate(x_test, y_test, verbose=0)\n", + "print('Test loss:', score[0])\n", + "print('Test accuracy:', score[1])" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [], + "source": [ + "## Save your model's weights for future private prediction\n", + "model.save('short-conv-mnist.h5')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Load and Serve the Model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that you have a trained model with normal Keras, you are ready to serve some private predictions. We can do that using Syft Keras.\n", + "\n", + "To secure and serve this model, we will need three TFEWorkers (servers). This is because TF Encrypted under the hood uses an encryption technique called [multi-party computation (MPC)](https://en.wikipedia.org/wiki/Secure_multi-party_computation). The idea is to split the model weights and input data into shares, then send a share of each value to the different servers. The key property is that if you look at the share on one server, it reveals nothing about the original value (input data or model weights).\n", + "\n", + "We'll define a Syft Keras model like we did in the previous notebook. However, there is a trick: before instantiating this model, we'll run `hook = sy.KerasHook(tf.keras)`. 
This will add three important new methods to the Keras Sequential class:\n", + " - `share`: will secure your model via secret sharing; by default, it will use the SecureNN protocol from TF Encrypted to secret share your model between each of the three TFEWorkers. Most importantly, this will add the capability of providing predictions on encrypted data.\n", + " - `serve`: this function will launch a serving queue, so that the TFEWorkers can accept prediction requests on the secured model from external clients.\n", + " - `shutdown_workers`: once you are done providing private predictions, you can shut down your model by running this function. It will direct you to shut down the server processes manually if you've opted to manually manage each worker.\n", + "\n", + "If you want to learn more about MPC, you can read this excellent [blog](https://mortendahl.github.io/2017/04/17/private-deep-learning-with-mpc/)." + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "from tensorflow.keras import Sequential\n", + "from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Activation, Flatten, ReLU\n", + "\n", + "import syft as sy\n", + "hook = sy.KerasHook(tf.keras)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Model\n", + "\n", + "As you can see, we define almost the exact same model as before, except we provide a `batch_input_shape`. This allows TF Encrypted to better optimize the secure computations via predefined tensor shapes. For this MNIST demo, we'll send input data with the shape of (1, 28, 28, 1). \n", + "We also return the logits instead of the softmax because this operation is complex to perform using MPC, and we don't need it to serve prediction requests."
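Because the served model stops at the logits, class probabilities are recovered by the client after decryption; that is cheap in plaintext. A minimal, numerically stable sketch (pure Python, illustrative only and not part of the Syft API):

```python
import math

def softmax(logits):
    """Numerically stable softmax, applied client-side on decrypted logits."""
    m = max(logits)                              # subtract max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Decrypted logits for a 3-class toy example
probs = softmax([2.0, 1.0, 0.1])
print(probs.index(max(probs)))  # predicted class index: 0
```

Note that for pure classification the softmax is not even required: the argmax over the decrypted logits already gives the predicted class.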
+ ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [], + "source": [ + "num_classes = 10\n", + "input_shape = (1, 28, 28, 1)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Flatten())\n", + "model.add(Dense(num_classes, name=\"logit\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Load Pre-trained Weights\n", + "\n", + "With `load_weights` you can easily load the weights you have saved previously after training your model." + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": {}, + "outputs": [], + "source": [ + "pre_trained_weights = 'short-conv-mnist.h5'\n", + "model.load_weights(pre_trained_weights)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Set Up Your Worker Connectors\n", + "\n", + "Let's now connect to the TFEWorkers (`alice`, `bob`, and `carol`) required by TF Encrypted to perform private predictions. For each TFEWorker, you just have to specify a host. We then combine these workers in a cluster.\n", + "\n", + "These workers run a [TensorFlow server](https://www.tensorflow.org/api_docs/python/tf/distribute/Server), which you can either manage manually (`AUTO = False`) or ask the workers to manage for you (`AUTO = True`). If choosing to manually manage them, you will be instructed to execute a terminal command on each worker's host device after calling `model.share()` below. If all workers are hosted on a single device (e.g. `localhost`), you can choose to have Syft automatically manage the worker's TensorFlow server."
+ ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4000: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server0'\n", + "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", + "\n", + "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4001: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server1'\n", + "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", + "\n", + "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4002: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server2'\n", + "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", + "\n" + ] + } + ], + "source": [ + "AUTO = False\n", + "\n", + "alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)\n", + "bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)\n", + "carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)\n", + "\n", + "cluster = sy.TFECluster(alice, bob, carol)\n", + "cluster.start()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 4: Launch 3 Servers\n", + "\n", + "If you have chosen to manually control the workers (i.e. `AUTO = False`) then you now need to launch 3 servers. Look for the exact commands to run in the info messages printed above. 
You are looking for the following, where the `...` are actual file paths:\n", + "\n", + "- `python -m tf_encrypted.player --config ... server0`\n", + "- `python -m tf_encrypted.player --config ... server1`\n", + "- `python -m tf_encrypted.player --config ... server2`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 5: Split the Model Into Shares\n", + "\n", + "Thanks to `sy.KerasHook(tf.keras)` you can call the `share` method to transform your model into a TF Encrypted Keras model.\n", + "\n", + "If you have asked to manually manage servers above, then this step will not complete until they have all been launched. Note that your firewall may ask for Python to accept incoming connections." + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:tf_encrypted:Starting session on target 'grpc://localhost:4000' using config graph_options {\n", + "}\n", + "\n" + ] + } + ], + "source": [ + "model.share(cluster)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 6: Serve the Model\n", + "\n", + "Perfect! Now by calling `model.serve`, your model is ready to provide some private predictions. You can set `num_requests` to set a limit on the number of prediction requests served by the model; if not specified then the model will be served until interrupted."
+ ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Served encrypted prediction 1 to client.\n", + "Served encrypted prediction 2 to client.\n", + "Served encrypted prediction 3 to client.\n" + ] + } + ], + "source": [ + "model.serve(num_requests=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 7: Run the Client\n", + "\n", + "At this point open up and run the companion notebook: Section 4b - Encrypted Keras Client" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 8: Shut Down the Servers\n", + "\n", + "Once your request limit above has been reached, the model will no longer be available for serving requests, but it's still secret shared between the three workers above. You can kill the workers by executing the cell below.\n", + "\n", + "**Congratulations** on finishing Part 12: Secure Classification with Syft Keras and TFE!" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:tf_encrypted:Please terminate the process on host 'localhost:4000'.\n", + "INFO:tf_encrypted:Please terminate the process on host 'localhost:4001'.\n", + "INFO:tf_encrypted:Please terminate the process on host 'localhost:4002'.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Process ID 15442 has been killed.\n", + "Process ID 15438 has been killed.\n", + "Process ID 15432 has been killed.\n" + ] + } + ], + "source": [ + "model.stop()\n", + "cluster.stop()\n", + "\n", + "if not AUTO:\n", + " process_ids = !ps aux | grep '[p]ython -m tf_encrypted.player --config' | awk '{print $2}'\n", + " for process_id in process_ids:\n", + " !kill {process_id}\n", + " print(\"Process ID {id} has been killed.\".format(id=process_id))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": 
{}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Keystone Project - Mix and Match What You've Learned\n", + "\n", + "Description: Take two of the concepts you've learned about in this course (Encrypted Computation, Federated Learning, Differential Privacy) and combine them for a use case of your own design. Extra credit if you can get your demo working with [WebSocketWorkers](https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials/advanced/websockets-example-MNIST) instead of VirtualWorkers! Then take your demo or example application, write a blogpost, and share that blogpost in #general-discussion on OpenMined's slack!!!\n", + "\n", + "Inspiration:\n", + "- This Course's Code: https://github.com/Udacity/private-ai\n", + "- OpenMined's Tutorials: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials\n", + "- OpenMined's Blog: https://blog.openmined.org" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + 
"metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.8" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 831f8b106437431ea6cf782282745c49a80a53d9 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 26 Aug 2019 22:01:21 +0530 Subject: [PATCH 11/14] Delete Section 4 - Encrypted Deep Learning.ipynb --- Section 4 - Encrypted Deep Learning.ipynb | 1680 --------------------- 1 file changed, 1680 deletions(-) delete mode 100644 Section 4 - Encrypted Deep Learning.ipynb diff --git a/Section 4 - Encrypted Deep Learning.ipynb b/Section 4 - Encrypted Deep Learning.ipynb deleted file mode 100644 index 
a396bc6..0000000 --- a/Section 4 - Encrypted Deep Learning.ipynb +++ /dev/null @@ -1,1680 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section: Encrypted Deep Learning\n", - "\n", - "- Lesson: Reviewing Additive Secret Sharing\n", - "- Lesson: Encrypted Subtraction and Public/Scalar Multiplication\n", - "- Lesson: Encrypted Computation in PySyft\n", - "- Project: Build an Encrypted Database\n", - "- Lesson: Encrypted Deep Learning in PyTorch\n", - "- Lesson: Encrypted Deep Learning in Keras\n", - "- Final Project" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Reviewing Additive Secret Sharing\n", - "\n", - "_For more great information about SMPC protocols like this one, visit https://mortendahl.github.io. With permission, Morten's work directly inspired this first teaching segment._" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import random\n", - "import numpy as np\n", - "\n", - "BASE = 10\n", - "\n", - "PRECISION_INTEGRAL = 8\n", - "PRECISION_FRACTIONAL = 8\n", - "Q = 293973345475167247070445277780365744413\n", - "\n", - "PRECISION = PRECISION_INTEGRAL + PRECISION_FRACTIONAL\n", - "\n", - "assert(Q > BASE**PRECISION)\n", - "\n", - "def encode(rational):\n", - " upscaled = int(rational * BASE**PRECISION_FRACTIONAL)\n", - " field_element = upscaled % Q\n", - " return field_element\n", - "\n", - "def decode(field_element):\n", - " upscaled = field_element if field_element <= Q/2 else field_element - Q\n", - " rational = upscaled / BASE**PRECISION_FRACTIONAL\n", - " return rational\n", - "\n", - "def encrypt(secret):\n", - " first = random.randrange(Q)\n", - " second = random.randrange(Q)\n", - " third = (secret - first - second) % Q\n", - " return [first, second, third]\n", - "\n", - "def decrypt(sharing):\n", - " return sum(sharing) % Q\n", - "\n", - "def add(a, b):\n", - " c = list()\n", - " for i in 
range(len(a)):\n", - " c.append((a[i] + b[i]) % Q)\n", - " return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[157261321043261917823065844237599434634,\n", - " 132977400494673236547132219369279939003,\n", - " 3734623937232092700247214174036370776]" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = encrypt(encode(5.5))\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[154234279162428625260264272011868616963,\n", - " 160039428260900034921952397194418692688,\n", - " 273672983527005833958673886354674179174]" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "y = encrypt(encode(2.3))\n", - "y" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(17522254730523296012884838469102307184,\n", - " 293016828755573271469084616563698631691,\n", - " 277407607464237926658921100528710549950)" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = add(x,y)\n", - "z" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "7.79999999" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "decode(decrypt(z))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Subtraction and Public/Scalar Multiplication" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "field = 23740629843760239486723" - ] - }, - { - 
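One property the additive-sharing cells above rely on but never demonstrate is that `decode` also recovers *negative* results: any field element above `Q/2` is interpreted as a negative number. A self-contained check (a sketch using the same parameters as the lesson, not part of the original notebook):

```python
import random

# Parameters from the lesson above
BASE = 10
PRECISION_FRACTIONAL = 8
Q = 293973345475167247070445277780365744413

def encode(rational):
    # Scale the rational up to an integer in the field
    return int(rational * BASE**PRECISION_FRACTIONAL) % Q

def decode(field_element):
    # Field elements above Q/2 represent negative numbers
    upscaled = field_element if field_element <= Q // 2 else field_element - Q
    return upscaled / BASE**PRECISION_FRACTIONAL

def encrypt(secret):
    # Two uniformly random shares; the third makes them sum to the secret mod Q
    first = random.randrange(Q)
    second = random.randrange(Q)
    return [first, second, (secret - first - second) % Q]

def decrypt(sharing):
    return sum(sharing) % Q

def add(a, b):
    # Shares are added pairwise; each worker only ever touches its own share
    return [(ai + bi) % Q for ai, bi in zip(a, b)]

# 2.3 + (-5.5) = -3.2, recovered even though every intermediate value is mod Q
x = encrypt(encode(2.3))
y = encrypt(encode(-5.5))
print(decode(decrypt(add(x, y))))  # ≈ -3.2 (up to fixed-point rounding)
```

Each individual share is still uniformly random, so a single worker learns nothing about the sign (or anything else) of the hidden value.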
"cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "x = 5\n", - "\n", - "bob_x_share = 2372385723 # random number\n", - "alices_x_share = field - bob_x_share + x" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "5" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "(bob_x_share + alices_x_share) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "field = 10\n", - "\n", - "x = 5\n", - "\n", - "bob_x_share = 8\n", - "alice_x_share = field - bob_x_share + x\n", - "\n", - "y = 1\n", - "\n", - "bob_y_share = 9\n", - "alice_y_share = field - bob_y_share + y" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_x_share + alice_x_share) - (bob_y_share + alice_y_share)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_x_share - bob_y_share) + (alice_x_share - alice_y_share)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "26" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share + alice_x_share + bob_y_share + alice_y_share" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "bob_z_share = (bob_x_share - bob_y_share)\n", - "alice_z_share = (alice_x_share - alice_y_share)" - ] - }, - { - "cell_type": 
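The two-party toy cells above generalize directly: in the three-share scheme from the first lesson, encrypted subtraction and multiplication by a *public* scalar are also purely local operations on shares. A consolidated sketch (assuming the same field `Q` and fixed-point encoding as the first lesson):

```python
import random

Q = 293973345475167247070445277780365744413  # field modulus from the first lesson
SCALE = 10**8  # BASE**PRECISION_FRACTIONAL

def encode(rational):
    return int(rational * SCALE) % Q

def decode(fe):
    fe = fe if fe <= Q // 2 else fe - Q
    return fe / SCALE

def encrypt(secret):
    a, b = random.randrange(Q), random.randrange(Q)
    return [a, b, (secret - a - b) % Q]

def decrypt(shares):
    return sum(shares) % Q

def sub(a, b):
    # Each worker subtracts its own shares locally; no communication needed
    return [(ai - bi) % Q for ai, bi in zip(a, b)]

def imul(a, scalar):
    # Multiplying every share by a PUBLIC integer scales the hidden value
    return [(ai * scalar) % Q for ai in a]

x = encrypt(encode(5.5))
y = encrypt(encode(2.3))
print(decode(decrypt(sub(x, y))))   # ≈ 3.2
print(decode(decrypt(imul(x, 3))))  # 16.5
```

Note that `imul` here only handles public *integer* scalars: a fractional scalar would change the fixed-point scale of the hidden value and require an extra truncation step.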
"code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "(bob_z_share + alice_z_share) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "def sub(a, b):\n", - " c = list()\n", - " for i in range(len(a)):\n", - " c.append((a[i] - b[i]) % Q)\n", - " return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "field = 10\n", - "\n", - "x = 5\n", - "\n", - "bob_x_share = 8\n", - "alice_x_share = field - bob_x_share + x\n", - "\n", - "y = 1\n", - "\n", - "bob_y_share = 9\n", - "alice_y_share = field - bob_y_share + y" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "15" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_x_share + alice_x_share" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "11" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "bob_y_share + alice_y_share" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "3" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "((bob_y_share * 3) + (alice_y_share * 3)) % field" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [], - "source": [ - "def imul(a, scalar):\n", - " \n", - " # logic here which can multiply by a public scalar\n", - " \n", - " c = list()\n", - " \n", - " for i in range(len(a)):\n", - " c.append((a[i] * scalar) % Q)\n", - " \n", - " 
return tuple(c)" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[102429626723126324886479814859023395389,\n", - " 119121291581510607047487749044084540899,\n", - " 72422427170530315136477713877807808125]" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "x = encrypt(encode(5.5))\n", - "x" - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "z = imul(x, 3)" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "16.5" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "decode(decrypt(z))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Computation in PySyft" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [], - "source": [ - "import syft as sy\n", - "import torch as th\n", - "hook = sy.TorchHook(th)\n", - "from torch import nn, optim" - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [], - "source": [ - "bob = sy.VirtualWorker(hook, id=\"bob\").add_worker(sy.local_worker)\n", - "alice = sy.VirtualWorker(hook, id=\"alice\").add_worker(sy.local_worker)\n", - "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\").add_worker(sy.local_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [], - "source": [ - "x = th.tensor([1,2,3,4])\n", - "y = th.tensor([2,-1,1,0])" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [], - "source": [ - "x = x.share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": {}, - "outputs": [], - "source": [ - "y = 
y.share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([3, 1, 4, 4])" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x + y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([-1, 3, 2, 4])" - ] - }, - "execution_count": 30, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x - y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([ 2, -2, 3, 0])" - ] - }, - "execution_count": 31, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x * y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0, 1, 1, 1])" - ] - }, - "execution_count": 32, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x > y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([1, 0, 0, 0])" - ] - }, - "execution_count": 33, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x < y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([0, 0, 0, 0])" - ] - }, - "execution_count": 34, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "z = x == y\n", - "z.get()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - 
"outputs": [ - { - "ename": "RuntimeError", - "evalue": "log2_vml_cpu not implemented for 'Long'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mth\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfix_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshare\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcrypto_provider\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msecure_worker\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - 
"\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mfix_prec\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 643\u001b[0m \u001b[0mprec_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"precision_fractional\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 644\u001b[0m \u001b[0mmax_precision\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_get_maximum_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 645\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_requires_large_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmax_precision\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbase\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprec_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 646\u001b[0m return (\n\u001b[1;32m 647\u001b[0m \u001b[0msyft\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mLargePrecisionTensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36m_requires_large_precision\u001b[0;34m(self, max_precision, base, precision_fractional)\u001b[0m\n\u001b[1;32m 666\u001b[0m \"\"\"\n\u001b[1;32m 667\u001b[0m \u001b[0mbase_fractional\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbase\u001b[0m \u001b[0;34m**\u001b[0m 
\u001b[0mprecision_fractional\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 668\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0many\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mbase_fractional\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0mmax_precision\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 669\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 670\u001b[0m def share(\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 661\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 662\u001b[0m \u001b[0;31m# we can make some errors more descriptive with this method\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 663\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mroute_method_exception\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 664\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 665\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# means that there is a wrapper to 
remove\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/hook/hook.py\u001b[0m in \u001b[0;36moverloaded_native_method\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 655\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 656\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 657\u001b[0;31m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 658\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 659\u001b[0m \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mRuntimeError\u001b[0m: log2_vml_cpu not implemented for 'Long'" - ] - } - ], - "source": [ - "x = th.tensor([1,2,3,4])\n", - "y = th.tensor([2,-1,1,0])\n", - "\n", - "x = x.fix_precision().share(bob, alice, crypto_provider=secure_worker)\n", - "y = y.fix_precision().share(bob, alice, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [ - { - "ename": "AttributeError", - "evalue": "'Tensor' object has no attribute 'child'", - "output_type": "error", - "traceback": [ - 
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mz\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat_precision\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m~/work/dropoutlabs/PySyft/syft/frameworks/torch/tensors/interpreters/native.py\u001b[0m in \u001b[0;36mget\u001b[0;34m(self, inplace, *args, **kwargs)\u001b[0m\n\u001b[1;32m 555\u001b[0m \u001b[0;31m# return self\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 556\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 557\u001b[0;31m \u001b[0mtensor\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchild\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0;31m# Clean the wrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mAttributeError\u001b[0m: 'Tensor' object has no attribute 'child'" - ] - } - ], - "source": [ - "z = x + y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x - y\n", - 
"z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x * y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x > y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x < y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "z = x == y\n", - "z.get().float_precision()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project: Build an Encrypted Database" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# try this project here!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Deep Learning in PyTorch" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Train a Model" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tensor(0.9441)\n", - "tensor(1.0317)\n", - 
"tensor(1.5362)\n", - "tensor(1.5309)\n", - "tensor(1.6070)\n", - "tensor(0.4302)\n", - "tensor(0.1941)\n", - "tensor(0.0789)\n", - "tensor(0.0409)\n", - "tensor(0.0215)\n", - "tensor(0.0136)\n", - "tensor(0.0094)\n", - "tensor(0.0068)\n", - "tensor(0.0052)\n", - "tensor(0.0041)\n", - "tensor(0.0034)\n", - "tensor(0.0028)\n", - "tensor(0.0024)\n", - "tensor(0.0020)\n", - "tensor(0.0017)\n" - ] - } - ], - "source": [ - "from torch import nn\n", - "from torch import optim\n", - "import torch.nn.functional as F\n", - "\n", - "# A Toy Dataset\n", - "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)\n", - "target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)\n", - "\n", - "class Net(nn.Module):\n", - " def __init__(self):\n", - " super(Net, self).__init__()\n", - " self.fc1 = nn.Linear(2, 20)\n", - " self.fc2 = nn.Linear(20, 1)\n", - "\n", - " def forward(self, x):\n", - " x = self.fc1(x)\n", - " x = F.relu(x)\n", - " x = self.fc2(x)\n", - " return x\n", - "\n", - "# A Toy Model\n", - "model = Net()\n", - "\n", - "def train():\n", - " # Training Logic\n", - " opt = optim.SGD(params=model.parameters(),lr=0.1)\n", - " for iter in range(20):\n", - "\n", - " # 1) erase previous gradients (if they exist)\n", - " opt.zero_grad()\n", - "\n", - " # 2) make a prediction\n", - " pred = model(data)\n", - "\n", - " # 3) calculate how much we missed\n", - " loss = ((pred - target)**2).sum()\n", - "\n", - " # 4) figure out which weights caused us to miss\n", - " loss.backward()\n", - "\n", - " # 5) change those weights\n", - " opt.step()\n", - "\n", - " # 6) print our progress\n", - " print(loss.data)\n", - " \n", - "train()" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([[ 0.0140],\n", - " [-0.0166],\n", - " [ 0.9727],\n", - " [ 1.0159]], grad_fn=)" - ] - }, - "execution_count": 38, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "model(data)" 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Encrypt the Model and Data" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_model = model.fix_precision().share(alice, bob, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:85708028434 -> alice:81267402168]\n", - " \t-> (Wrapper)>[PointerTensor | me:62441055856 -> bob:99786602276]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:3433679299 -> alice:86383455980]\n", - " \t-> (Wrapper)>[PointerTensor | me:95582447050 -> bob:21797858764]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:10008001814 -> alice:42945910810]\n", - " \t-> (Wrapper)>[PointerTensor | me:71956775381 -> bob:2075238434]\n", - " \t*crypto provider: secure_worker*, Parameter containing:\n", - " Parameter>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - " \t-> (Wrapper)>[PointerTensor | me:94360348394 -> alice:9472537528]\n", - " \t-> (Wrapper)>[PointerTensor | me:44614305427 -> bob:98758467549]\n", - " \t*crypto provider: secure_worker*]" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list(encrypted_model.parameters())" - ] - }, - { - "cell_type": "code", - "execution_count": 41, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_data = data.fix_precision().share(alice, bob, crypto_provider=secure_worker)" - ] - }, - { - "cell_type": "code", - 
"execution_count": 42, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]\n", - "\t-> (Wrapper)>[PointerTensor | me:90895649049 -> alice:241475900]\n", - "\t-> (Wrapper)>[PointerTensor | me:45527513617 -> bob:69153393355]\n", - "\t*crypto provider: secure_worker*" - ] - }, - "execution_count": 42, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "encrypted_data" - ] - }, - { - "cell_type": "code", - "execution_count": 43, - "metadata": {}, - "outputs": [], - "source": [ - "encrypted_prediction = encrypted_model(encrypted_data)" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "tensor([[ 0.0130],\n", - " [-0.0160],\n", - " [ 0.9700],\n", - " [ 1.0140]])" - ] - }, - "execution_count": 44, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "encrypted_prediction.get().float_precision()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Lesson: Encrypted Deep Learning in Keras\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 1: Public Training\n", - "\n", - "Welcome to this tutorial! In the following notebooks you will learn how to provide private predictions. By private predictions, we mean that the data is constantly encrypted throughout the entire process. At no point is the user sharing raw data, only encrypted (that is, secret shared) data. In order to provide these private predictions, Syft Keras uses a library called [TF Encrypted](https://github.com/tf-encrypted/tf-encrypted) under the hood. 
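The encrypted PyTorch prediction above hides a lot of machinery. As a rough intuition only (not PySyft's actual protocol), a linear layer with public *integer* weights can be evaluated directly on secret shares using nothing beyond the share-addition and public-scalar-multiplication primitives from earlier; the modulus and helper names below are illustrative assumptions:

```python
import random

Q = 2**62      # illustrative modulus (assumption; PySyft uses its own field)
SCALE = 10**3  # 3 fractional digits, matching fix_precision()'s default

def encode(x):
    return int(round(x * SCALE)) % Q

def decode(fe):
    fe = fe if fe <= Q // 2 else fe - Q
    return fe / SCALE

def share(secret):
    a, b = random.randrange(Q), random.randrange(Q)
    return [a, b, (secret - a - b) % Q]

def reconstruct(shares):
    return sum(shares) % Q

def linear(shared_xs, int_weights, int_bias):
    # Start from a sharing of the encoded bias, then accumulate w * x
    # share-by-share. Every step is local to each worker.
    acc = share(encode(int_bias))
    for sh, w in zip(shared_xs, int_weights):
        acc = [(a + s * w) % Q for a, s in zip(acc, sh)]
    return acc

# y = 2*1.5 + (-1)*(-2.0) + 1 = 6.0, computed without ever reassembling x
xs = [share(encode(v)) for v in [1.5, -2.0]]
print(decode(reconstruct(linear(xs, [2, -1], 1))))  # 6.0
```

A real encrypted model additionally needs fractional weights (handled by truncating the doubled fixed-point scale after each multiply) and share-times-share multiplication, which requires precomputed randomness from a third party; roughly speaking, that is what `crypto_provider=secure_worker` supplies in the PySyft cells above.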
TF Encrypted combines cutting-edge cryptographic and machine learning techniques, but you don't have to worry about this and can focus on your machine learning application.\n", - "\n", - "You can start serving private predictions with only three steps:\n", - "- **Step 1**: train your model with normal Keras.\n", - "- **Step 2**: secure and serve your machine learning model (server).\n", - "- **Step 3**: query the secured model to receive private predictions (client). \n", - "\n", - "Alright, let's go through these three steps so you can deploy impactful machine learning services without sacrificing user privacy or model security.\n", - "\n", - "Huge shoutout to the Dropout Labs ([@dropoutlabs](https://twitter.com/dropoutlabs)) and TF Encrypted ([@tf_encrypted](https://twitter.com/tf_encrypted)) teams for their great work which makes this demo possible, especially: Jason Mancuso ([@jvmancuso](https://twitter.com/jvmancuso)), Yann Dupis ([@YannDupis](https://twitter.com/YannDupis)), and Morten Dahl ([@mortendahlcs](https://github.com/mortendahlcs)). \n", - "\n", - "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Train Your Model in Keras\n", - "\n", - "To use privacy-preserving machine learning techniques for your projects you should not have to learn a new machine learning framework. If you have basic [Keras](https://keras.io/) knowledge, you can start using these techniques with Syft Keras. If you have never used Keras before, you can learn a bit more about it through the [Keras documentation](https://keras.io). \n", - "\n", - "Before serving private predictions, the first step is to train your model with normal Keras. As an example, we will train a model to classify handwritten digits. 
To train this model we will use the canonical [MNIST dataset](http://yann.lecun.com/exdb/mnist/).\n", - "\n", - "We borrow [this example](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) from the reference Keras repository. To train your classification model, you just run the cell below." - ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "x_train shape: (60000, 28, 28, 1)\n", - "60000 train samples\n", - "10000 test samples\n", - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Colocations handled automatically by placer.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Colocations handled automatically by placer.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Train on 60000 samples, validate on 10000 samples\n", - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", - "Instructions for updating:\n", - "Use tf.cast instead.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "WARNING:tensorflow:From /usr/local/miniconda3/envs/syft/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will 
be removed in a future version.\n", - "Instructions for updating:\n", - "Use tf.cast instead.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Epoch 1/2\n", - "60000/60000 [==============================] - 26s 440us/sample - loss: 0.7004 - acc: 0.7795 - val_loss: 0.3245 - val_acc: 0.9011\n", - "Epoch 2/2\n", - "60000/60000 [==============================] - 22s 361us/sample - loss: 0.2265 - acc: 0.9311 - val_loss: 0.1698 - val_acc: 0.9487\n", - "Test loss: 0.1698406898036599\n", - "Test accuracy: 0.9487\n" - ] - } - ], - "source": [ - "from __future__ import print_function\n", - "import tensorflow.keras as keras\n", - "from tensorflow.keras.datasets import mnist\n", - "from tensorflow.keras.models import Sequential\n", - "from tensorflow.keras.layers import Dense, Dropout, Flatten\n", - "from tensorflow.keras.layers import Conv2D, AveragePooling2D\n", - "from tensorflow.keras.layers import Activation\n", - "\n", - "batch_size = 128\n", - "num_classes = 10\n", - "epochs = 2\n", - "\n", - "# input image dimensions\n", - "img_rows, img_cols = 28, 28\n", - "\n", - "# the data, split between train and test sets\n", - "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", - "\n", - "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", - "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", - "input_shape = (img_rows, img_cols, 1)\n", - "\n", - "x_train = x_train.astype('float32')\n", - "x_test = x_test.astype('float32')\n", - "x_train /= 255\n", - "x_test /= 255\n", - "print('x_train shape:', x_train.shape)\n", - "print(x_train.shape[0], 'train samples')\n", - "print(x_test.shape[0], 'test samples')\n", - "\n", - "# convert class vectors to binary class matrices\n", - "y_train = keras.utils.to_categorical(y_train, num_classes)\n", - "y_test = keras.utils.to_categorical(y_test, num_classes)\n", - "\n", - "model = Sequential()\n", - "\n", - "model.add(Conv2D(10, (3, 3), input_shape=input_shape))\n", - 
"model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(32, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(64, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Flatten())\n", - "model.add(Dense(num_classes, activation='softmax'))\n", - "\n", - "model.compile(loss=keras.losses.categorical_crossentropy,\n", - " optimizer=keras.optimizers.Adadelta(),\n", - " metrics=['accuracy'])\n", - "\n", - "model.fit(x_train, y_train,\n", - " batch_size=batch_size,\n", - " epochs=epochs,\n", - " verbose=1,\n", - " validation_data=(x_test, y_test))\n", - "score = model.evaluate(x_test, y_test, verbose=0)\n", - "print('Test loss:', score[0])\n", - "print('Test accuracy:', score[1])" - ] - }, - { - "cell_type": "code", - "execution_count": 46, - "metadata": {}, - "outputs": [], - "source": [ - "## Save your model's weights for future private prediction\n", - "model.save('short-conv-mnist.h5')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 2: Load and Serve the Model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that you have a trained model with normal Keras, you are ready to serve some private predictions. We can do that using Syft Keras.\n", - "\n", - "To secure and serve this model, we will need three TFEWorkers (servers). This is because TF Encrypted under the hood uses an encryption technique called [multi-party computation (MPC)](https://en.wikipedia.org/wiki/Secure_multi-party_computation). The idea is to split the model weights and input data into shares, then send a share of each value to the different servers. The key property is that if you look at the share on one server, it reveals nothing about the original value (input data or model weights).\n", - "\n", - "We'll define a Syft Keras model like we did in the previous notebook. 
However, there is a trick: before instantiating this model, we'll run `hook = sy.KerasHook(tf.keras)`. This will add three important new methods to the Keras Sequential class:\n", - " - `share`: will secure your model via secret sharing; by default, it will use the SecureNN protocol from TF Encrypted to secret share your model between each of the three TFEWorkers. Most importantly, this will add the capability of providing predictions on encrypted data.\n", - " - `serve`: this function will launch a serving queue, so that the TFEWorkers can accept prediction requests on the secured model from external clients.\n", - " - `shutdown_workers`: once you are done providing private predictions, you can shut down your model by running this function. It will direct you to shut down the server processes manually if you've opted to manually manage each worker.\n", - "\n", - "If you want to learn more about MPC, you can read this excellent [blog](https://mortendahl.github.io/2017/04/17/private-deep-learning-with-mpc/)." - ] - }, - { - "cell_type": "code", - "execution_count": 47, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import tensorflow as tf\n", - "from tensorflow.keras import Sequential\n", - "from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Activation, Flatten, ReLU\n", - "\n", - "import syft as sy\n", - "hook = sy.KerasHook(tf.keras)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Model\n", - "\n", - "As you can see, we define almost the exact same model as before, except we provide a `batch_input_shape`. This allows TF Encrypted to better optimize the secure computations via predefined tensor shapes. For this MNIST demo, we'll send input data with shape (1, 28, 28, 1). \n", - "We also return the logits instead of applying softmax, because softmax is expensive to compute under MPC, and we don't need it to serve prediction requests."
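Serving logits is enough for classification because softmax is strictly increasing: it never changes which class scores highest, so the client can simply take an argmax over the decrypted logits. A quick sanity check of that claim (a sketch assuming NumPy, not part of the tutorial code):

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability before exponentiating
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, -1.0, 0.5],
                   [0.1,  3.2, -0.7]])

# softmax preserves the ordering of the logits, so the predicted
# classes are identical whether or not softmax is applied
assert (softmax(logits).argmax(axis=1) == logits.argmax(axis=1)).all()
```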
- ] - }, - { - "cell_type": "code", - "execution_count": 48, - "metadata": {}, - "outputs": [], - "source": [ - "num_classes = 10\n", - "input_shape = (1, 28, 28, 1)\n", - "\n", - "model = Sequential()\n", - "\n", - "model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(32, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Conv2D(64, (3, 3)))\n", - "model.add(AveragePooling2D((2, 2)))\n", - "model.add(Activation('relu'))\n", - "model.add(Flatten())\n", - "model.add(Dense(num_classes, name=\"logit\"))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Load Pre-trained Weights\n", - "\n", - "With `load_weights` you can easily load the weights you saved previously after training your model." - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "metadata": {}, - "outputs": [], - "source": [ - "pre_trained_weights = 'short-conv-mnist.h5'\n", - "model.load_weights(pre_trained_weights)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 3: Setup Your Worker Connectors\n", - "\n", - "Let's now connect to the TFEWorkers (`alice`, `bob`, and `carol`) required by TF Encrypted to perform private predictions. For each TFEWorker, you just have to specify a host. We then combine these workers into a cluster.\n", - "\n", - "These workers run a [TensorFlow server](https://www.tensorflow.org/api_docs/python/tf/distribute/Server), which you can either manage manually (`AUTO = False`) or ask the workers to manage for you (`AUTO = True`). If choosing to manually manage them, you will be instructed to execute a terminal command on each worker's host device after calling `model.share()` below. If all workers are hosted on a single device (e.g. `localhost`), you can choose to have Syft automatically manage each worker's TensorFlow server."
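As a reminder of what each of these three workers will actually hold, here is additive secret sharing in miniature (a toy sketch over a small ring; the SecureNN protocol TF Encrypted really runs is far more sophisticated):

```python
import random

Q = 2**62  # toy modulus standing in for the protocol's ring size

def share(secret, n=3):
    """Split an integer into n additive shares that sum to `secret` mod Q."""
    shares = [random.randrange(Q) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    return sum(shares) % Q

shares = share(42)  # one share per worker: alice, bob, carol
assert reconstruct(shares) == 42
# Each share in isolation is a uniformly random ring element, so no
# single worker learns anything about the value being shared.
```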
- ] - }, - { - "cell_type": "code", - "execution_count": 50, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4000: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server0'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n", - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4001: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server1'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n", - "INFO:tf_encrypted:If not done already, please launch the following command in a terminal on host localhost:4002: 'python -m tf_encrypted.player --config /var/folders/mh/7hh_sz1d3532_k9w4kcmnt8c0000gn/T/tfe.config server2'\n", - "This can be done automatically in a local subprocess by setting `auto_managed=True` when instantiating a TFEWorker.\n", - "\n" - ] - } - ], - "source": [ - "AUTO = False\n", - "\n", - "alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)\n", - "bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)\n", - "carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)\n", - "\n", - "cluster = sy.TFECluster(alice, bob, carol)\n", - "cluster.start()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 4: Launch 3 Servers\n", - "\n", - "If you have chosen to manually control the workers (i.e. `AUTO = False`) then you now need to launch 3 servers. Look for the exact commands to run in the info messages printed above. 
You are looking for the following, where the `...` are actual file paths:\n", - "\n", - "- `python -m tf_encrypted.player --config ... server0`\n", - "- `python -m tf_encrypted.player --config ... server1`\n", - "- `python -m tf_encrypted.player --config ... server2`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 5: Split the Model Into Shares\n", - "\n", - "Thanks to `sy.KerasHook(tf.keras)` you can call the `share` method to transform your model into a TF Encrypted Keras model.\n", - "\n", - "If you have chosen to manually manage the servers above, this step will not complete until they have all been launched. Note that your firewall may ask for Python to accept incoming connections." - ] - }, - { - "cell_type": "code", - "execution_count": 51, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:Starting session on target 'grpc://localhost:4000' using config graph_options {\n", - "}\n", - "\n" - ] - } - ], - "source": [ - "model.share(cluster)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 6: Serve the Model\n", - "\n", - "Perfect! Now by calling `model.serve`, your model is ready to provide some private predictions. You can use `num_requests` to set a limit on the number of prediction requests served by the model; if it is not specified, the model will be served until interrupted."
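Conceptually, `model.serve(num_requests=3)` behaves like a bounded request loop; the sketch below mimics that contract in plain Python (hypothetical names and a plain queue — the real serving logic lives inside Syft Keras):

```python
from queue import Queue

def serve(handle, requests, num_requests=None):
    """Answer queued requests, stopping after `num_requests` if it is set."""
    served = 0
    while num_requests is None or served < num_requests:
        if requests.empty():
            break  # a real server would block here, waiting for clients
        handle(requests.get())
        served += 1
        print(f"Served encrypted prediction {served} to client.")
    return served

requests = Queue()
for query in (1, 2, 3, 4):
    requests.put(query)

# the fourth queued request is left unserved once the limit is reached
assert serve(lambda q: q, requests, num_requests=3) == 3
```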
- ] - }, - { - "cell_type": "code", - "execution_count": 52, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Served encrypted prediction 1 to client.\n", - "Served encrypted prediction 2 to client.\n", - "Served encrypted prediction 3 to client.\n" - ] - } - ], - "source": [ - "model.serve(num_requests=3)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 7: Run the Client\n", - "\n", - "At this point, open and run the companion notebook: Section 4b - Encrypted Keras Client" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 8: Shutdown the Servers\n", - "\n", - "Once the request limit above has been reached, the model will no longer be available for serving requests, but it is still secret shared between the three workers above. You can kill the workers by executing the cell below.\n", - "\n", - "**Congratulations** on finishing Part 12: Secure Classification with Syft Keras and TFE!" - ] - }, - { - "cell_type": "code", - "execution_count": 53, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4000'.\n", - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4001'.\n", - "INFO:tf_encrypted:Please terminate the process on host 'localhost:4002'.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Process ID 15442 has been killed.\n", - "Process ID 15438 has been killed.\n", - "Process ID 15432 has been killed.\n" - ] - } - ], - "source": [ - "model.stop()\n", - "cluster.stop()\n", - "\n", - "if not AUTO:\n", - " process_ids = !ps aux | grep '[p]ython -m tf_encrypted.player --config' | awk '{print $2}'\n", - " for process_id in process_ids:\n", - " !kill {process_id}\n", - " print(\"Process ID {id} has been killed.\".format(id=process_id))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata":
{}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Keystone Project - Mix and Match What You've Learned\n", - "\n", - "Description: Take two of the concepts you've learned about in this course (Encrypted Computation, Federated Learning, Differential Privacy) and combine them for a use case of your own design. Extra credit if you can get your demo working with [WebSocketWorkers](https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials/advanced/websockets-example-MNIST) instead of VirtualWorkers! Then take your demo or example application, write a blogpost, and share that blogpost in #general-discussion on OpenMined's slack!!!\n", - "\n", - "Inspiration:\n", - "- This Course's Code: https://github.com/Udacity/private-ai\n", - "- OpenMined's Tutorials: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials\n", - "- OpenMined's Blog: https://blog.openmined.org" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - 
"metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From d4a5fd07724ef3d0e3891b9fc56b555b049132d2 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 26 Aug 2019 22:03:17 +0530 Subject: [PATCH 12/14] Add files via upload --- Section_4_Encrypted_Deep_Learning (1).ipynb | 2579 +++++++++++++++++++ 1 file changed, 2579 insertions(+) create mode 100644 Section_4_Encrypted_Deep_Learning (1).ipynb diff --git a/Section_4_Encrypted_Deep_Learning (1).ipynb b/Section_4_Encrypted_Deep_Learning (1).ipynb new file mode 100644 index 0000000..e52ab15 --- 
/dev/null +++ b/Section_4_Encrypted_Deep_Learning (1).ipynb @@ -0,0 +1,2579 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.8" + }, + "colab": { + "name": "Section 4 - Encrypted Deep Learning.ipynb", + "version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "nCXTqNMt_E_A", + "colab_type": "text" + }, + "source": [ + "# Section: Encrypted Deep Learning\n", + "\n", + "- Lesson: Reviewing Additive Secret Sharing\n", + "- Lesson: Encrypted Subtraction and Public/Scalar Multiplication\n", + "- Lesson: Encrypted Computation in PySyft\n", + "- Project: Build an Encrypted Database\n", + "- Lesson: Encrypted Deep Learning in PyTorch\n", + "- Lesson: Encrypted Deep Learning in Keras\n", + "- Final Project" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RIEPsqtX_E_C", + "colab_type": "text" + }, + "source": [ + "# Lesson: Reviewing Additive Secret Sharing\n", + "\n", + "_For more great information about SMPC protocols like this one, visit https://mortendahl.github.io. 
With permission, Morten's work directly inspired this first teaching segment._" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "F-00TtQm_E_D", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import random\n", + "import numpy as np\n", + "\n", + "BASE = 10\n", + "\n", + "PRECISION_INTEGRAL = 8\n", + "PRECISION_FRACTIONAL = 8\n", + "Q = 293973345475167247070445277780365744413\n", + "\n", + "PRECISION = PRECISION_INTEGRAL + PRECISION_FRACTIONAL\n", + "\n", + "assert(Q > BASE**PRECISION)\n", + "\n", + "def encode(rational):\n", + " upscaled = int(rational * BASE**PRECISION_FRACTIONAL)\n", + " field_element = upscaled % Q\n", + " return field_element\n", + "\n", + "def decode(field_element):\n", + " upscaled = field_element if field_element <= Q/2 else field_element - Q\n", + " rational = upscaled / BASE**PRECISION_FRACTIONAL\n", + " return rational\n", + "\n", + "def encrypt(secret):\n", + " first = random.randrange(Q)\n", + " second = random.randrange(Q)\n", + " third = (secret - first - second) % Q\n", + " return [first, second, third]\n", + "\n", + "def decrypt(sharing):\n", + " return sum(sharing) % Q\n", + "\n", + "def add(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] + b[i]) % Q)\n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "RlwX1Cdb_E_H", + "colab_type": "code", + "outputId": "7d6dc36e-7834-4f89-f58d-109723c22684", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "x = encrypt(encode(5.5))\n", + "x" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[254701775054821790509044042188852120019,\n", + " 255944101335251864290207221642490109455,\n", + " 77300814560260839341639291729939259352]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 51 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MHFNPBXx_E_N", + 
"colab_type": "code", + "outputId": "08e846d9-a448-4ecc-db8c-40b0ab723d12", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "y = encrypt(encode(2.3))\n", + "y" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[19257797378248405772458046520010852896,\n", + " 257627881806182697668801835019073289614,\n", + " 17087666290736143629185396241511601902]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 52 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "1IBVWGUc_E_R", + "colab_type": "code", + "outputId": "40551a8f-6774-474a-b708-6d03daa54f3b", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "z = add(x,y)\n", + "z" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(273959572433070196281502088708862972915,\n", + " 219598637666267314888563778881197654656,\n", + " 94388480850996982970824687971450861254)" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 53 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "C4oVbvY0_E_U", + "colab_type": "code", + "outputId": "90ff6789-b38f-4ec1-ea73-9eea8988d4ac", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "decode(decrypt(z))" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "7.79999999" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 54 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p1q4WnWF_E_Y", + "colab_type": "text" + }, + "source": [ + "# Lesson: Encrypted Subtraction and Public/Scalar Multiplication" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "hUWxwLI1_E_Z", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "t3I7aYpP_E_d", + 
"colab_type": "code", + "colab": {} + }, + "source": [ + "field = 23740629843760239486723" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "sjGhPrtS_E_h", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = 5\n", + "\n", + "bob_x_share = 2372385723 # random number\n", + "alices_x_share = field - bob_x_share + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UcTF8nPU_E_k", + "colab_type": "code", + "outputId": "43afee8f-5a25-4313-96ce-337018e5e213", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "(bob_x_share + alices_x_share) % field" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "5" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 57 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "OEaJQDrw_E_r", + "colab_type": "code", + "colab": {} + }, + "source": [ + "field = 10\n", + "\n", + "x = 5\n", + "\n", + "bob_x_share = 8\n", + "alice_x_share = field - bob_x_share + x\n", + "\n", + "y = 1\n", + "\n", + "bob_y_share = 9\n", + "alice_y_share = field - bob_y_share + y" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Te0nQLmj_E_v", + "colab_type": "code", + "outputId": "d0e9df81-0054-4477-e04c-5b4e6039ee44", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "((bob_x_share + alice_x_share) - (bob_y_share + alice_y_share)) % field" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "4" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 59 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "cVdeL7nN_E_z", + "colab_type": "code", + "outputId": "4e0d250c-fa88-471b-8379-7ae8ae83adb2", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + 
"((bob_x_share - bob_y_share) + (alice_x_share - alice_y_share)) % field" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "4" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 60 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JblLTFgA_E_3", + "colab_type": "code", + "outputId": "a79369bb-9b04-4f1d-ff07-1cab218c43fa", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "bob_x_share + alice_x_share + bob_y_share + alice_y_share" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "26" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 61 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "DaVjbnhv_E_7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob_z_share = (bob_x_share - bob_y_share)\n", + "alice_z_share = (alice_x_share - alice_y_share)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QxSKSy_Z_E_-", + "colab_type": "code", + "outputId": "46d1f6e9-2fa6-429c-d9d2-68fd54880e0c", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "(bob_z_share + alice_z_share) % field" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "4" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 63 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "v8rS537y_FAC", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def sub(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] - b[i]) % Q)\n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "CWPSVLV5_FAF", + "colab_type": "code", + "colab": {} + }, + "source": [ + "field = 10\n", + "\n", + "x = 5\n", + "\n", + "bob_x_share = 8\n", + 
"alice_x_share = field - bob_x_share + x\n", + "\n", + "y = 1\n", + "\n", + "bob_y_share = 9\n", + "alice_y_share = field - bob_y_share + y" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "5hJcVerk_FAI", + "colab_type": "code", + "outputId": "54be28a5-fe88-4ec0-cd20-9fd53b32a792", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "bob_x_share + alice_x_share" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "15" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 66 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "EBDljWvp_FAM", + "colab_type": "code", + "outputId": "6fa5174d-b139-45cd-da6d-56c8536dab8f", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "bob_y_share + alice_y_share" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "11" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 67 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "hJK0vm30_FAR", + "colab_type": "code", + "outputId": "e2f07c2d-ffb5-449e-d35f-14c387d9f771", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "((bob_y_share * 3) + (alice_y_share * 3)) % field" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "3" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 68 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "6OI1NqKv_FAZ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def imul(a, scalar):\n", + " \n", + " # logic here which can multiply by a public scalar\n", + " \n", + " c = list()\n", + " \n", + " for i in range(len(a)):\n", + " c.append((a[i] * scalar) % Q)\n", + " \n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": 
"code", + "metadata": { + "id": "f__I3aK6_FAh", + "colab_type": "code", + "outputId": "56793abe-e0a3-44fc-fb29-399c93a815a5", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "x = encrypt(encode(5.5))\n", + "x" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[255998558750199147828667522072747064879,\n", + " 212179180245338482592752454612531180290,\n", + " 119768951954796863719470578876003243657]" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 70 + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "qhM9qPzU_FAn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = imul(x, 3)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Ipuat7b2_FAs", + "colab_type": "code", + "outputId": "f7f23a27-f4f4-4cb4-c671-f260a7356363", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "decode(decrypt(z))" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "16.5" + ] + }, + "metadata": { + "tags": [] + }, + "execution_count": 72 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "c9JjhMuk_FAx", + "colab_type": "text" + }, + "source": [ + "# Lesson: Encrypted Computation in PySyft" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "gBrNtDVJ_FA4", + "colab_type": "code", + "outputId": "00d41f96-fba9-4082-dbca-51808917a09d", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "source": [ + "import syft as sy\n", + "import torch as th\n", + "hook = sy.TorchHook(th)\n", + "from torch import nn, optim" + ], + "execution_count": 0, + "outputs": [ + { + "output_type": "error", + "ename": "ModuleNotFoundError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + 
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0msyft\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0msy\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mtorch\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mth\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mhook\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msy\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTorchHook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mth\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mtorch\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mnn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moptim\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'syft'", + "", + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0;32m\nNOTE: If your import is failing due to a missing package, you can\nmanually install dependencies using either !pip or !apt.\n\nTo view examples of installing some common dependencies, click the\n\"Open Examples\" button below.\n\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "d9tBO6h5_FA-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\").add_worker(sy.local_worker)\n", + "alice = sy.VirtualWorker(hook, id=\"alice\").add_worker(sy.local_worker)\n", + "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\").add_worker(sy.local_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "DPqKJiEr_FBE", 
+ "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "mJzUh4Ti_FBK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = x.share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "j98nlzjt_FBQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = y.share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LKjW7kGL_FBW", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x + y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "50f3xG8r_FBc", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x - y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LR-rR_xK_FBl", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x * y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "zqVlBANT_FBr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x > y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "XOoSnUYQ_FBy", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x < y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "D_aDd5yX_FB3", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x == y\n", + "z.get()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "F1csulNg_FB-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + 
"execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Yws3mIsP_FCH", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])\n", + "\n", + "x = x.fix_precision().share(bob, alice, crypto_provider=secure_worker)\n", + "y = y.fix_precision().share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "OSJabmKK_FCM", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x + y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nd6SFN-H_FC6", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x - y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "nf5folDq_FC-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x * y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "eg2lI8BT_FDA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x > y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "GJnM54TF_FDE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x < y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "fVNJ4jC__FDH", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = x == y\n", + "z.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AUmT-6JT_FDM", + "colab_type": "text" + }, + "source": [ + "# Project: Build an Encrypted Database" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": 
"d5XTS36r_FDO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import string" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "4q_MCOJj_FDV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "char2Index = ()\n", + "index2char = ()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "p9LWqfaH_FDa", + "colab_type": "code", + "colab": {} + }, + "source": [ + "for i,char in enumerator(' '+ string.ascii_lowercase + '0123456789' + string.punctuation) :\n", + " char2index[char] = 1\n", + " index2char[i] = char" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "YgaJQf74_FDf", + "colab_type": "code", + "colab": {} + }, + "source": [ + "str_input = \"Hello\"\n", + "max_len = 8" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "CNnjB2ko_FDn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def string2values(str_input, max_len=8) :\n", + " \n", + " str_input = str_input[:max_len].lower()\n", + " \n", + " if(len(str_input) < max_len) :\n", + " str_input = str_input + \".\" * (max_len - len(str_input))\n", + " \n", + " values = list()\n", + " for char in str_input :\n", + " values.append(char2int[char])\n", + " \n", + " return th.tensor(values).long()\n", + " " + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "6JmMxY9e_FDu", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def one_hot(index, length) :\n", + " vect = th.zeros(length).long()\n", + " vect[index] = 1\n", + " return vect" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ob0nGGTv_FDx", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def string2one_hot_matrix(str_input, max_len=8) :\n", + " \n", + " str_input = str_input[:max_len].lower()\n", + " \n", + " 
if(len(str_input) < max_len) :\n", + " str_input = str_input + \".\" * (max_len - len(str_input))\n", + " \n", + " char_vectors = list()\n", + " for char in str_input:\n", + " char_v = one_hot(char2int[char], len(int2char)).unsqueeze(0)\n", + " char_vectors.append(char_v)\n", + " \n", + " return th.cat(char_vectors, dim=0) \n", + "\n", + "\n", + "one_hot(char2index['p'], len(index2char))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "EBm3CeLEk8OX", + "colab_type": "code", + "colab": {} + }, + "source": [ + "matrix = string2one_hot_matrix(\"Hello\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "QMaDlA-xk8Cj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "class EncryptedD8() :\n", + " \n", + " def __init__(self, *owners, max_key_len=8, max_val_len=8) :\n", + " self.max_key_len = 8\n", + " self.max_val_len = 8\n", + " \n", + " self.keys = list()\n", + " self.values = list()\n", + " self.owners = owners\n", + " \n", + " def add_entry(self, key, value) :\n", + " key = string2one_hot_matrix(key)\n", + " key = key.s" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UofeyV-rk747", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FTZaRHSH_FD0", + "colab_type": "text" + }, + "source": [ + "# Lesson: Encrypted Deep Learning in PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xZLp7cEw_FD2", + "colab_type": "text" + }, + "source": [ + "### Train a Model" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "jdknrSM1_FD3", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from torch import nn\n", + "from torch import optim\n", + "import torch.nn.functional as F\n", + "\n", + "# A Toy Dataset\n", + "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], 
requires_grad=True)\n", + "target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)\n", + "\n", + "class Net(nn.Module):\n", + " def __init__(self):\n", + " super(Net, self).__init__()\n", + " self.fc1 = nn.Linear(2, 20)\n", + " self.fc2 = nn.Linear(20, 1)\n", + "\n", + " def forward(self, x):\n", + " x = self.fc1(x)\n", + " x = F.relu(x)\n", + " x = self.fc2(x)\n", + " return x\n", + "\n", + "# A Toy Model\n", + "model = Net()\n", + "\n", + "def train():\n", + " # Training Logic\n", + " opt = optim.SGD(params=model.parameters(),lr=0.1)\n", + " for iter in range(20):\n", + "\n", + " # 1) erase previous gradients (if they exist)\n", + " opt.zero_grad()\n", + "\n", + " # 2) make a prediction\n", + " pred = model(data)\n", + "\n", + " # 3) calculate how much we missed\n", + " loss = ((pred - target)**2).sum()\n", + "\n", + " # 4) figure out which weights caused us to miss\n", + " loss.backward()\n", + "\n", + " # 5) change those weights\n", + " opt.step()\n", + "\n", + " # 6) print our progress\n", + " print(loss.data)\n", + " \n", + "train()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "uw7CDGZL_FD6", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model(data)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ujNmdP_G_FD9", + "colab_type": "text" + }, + "source": [ + "## Encrypt the Model and Data" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "GVumzc_8_FD-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_model = model.fix_precision().share(alice, bob, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "0v_iKkoY_FEA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "list(encrypted_model.parameters())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": 
"ApNjz4ZI_FEE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_data = data.fix_precision().share(alice, bob, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "GFJE6HYW_FEM", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_data" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ysMjhwis_FEP", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_prediction = encrypted_model(encrypted_data)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Yub0sHEh_FET", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_prediction.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9e-FRPW2_FEX", + "colab_type": "text" + }, + "source": [ + "# Lesson: Encrypted Deep Learning in Keras\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xyW29UDi_FEX", + "colab_type": "text" + }, + "source": [ + "## Step 1: Public Training\n", + "\n", + "Welcome to this tutorial! In the following notebooks you will learn how to provide private predictions. By private predictions, we mean that the data is constantly encrypted throughout the entire process. At no point is the user sharing raw data, only encrypted (that is, secret shared) data. In order to provide these private predictions, Syft Keras uses a library called [TF Encrypted](https://github.com/tf-encrypted/tf-encrypted) under the hood. 
TF Encrypted combines cutting-edge cryptographic and machine learning techniques, but you don't have to worry about this and can focus on your machine learning application.\n", + "\n", + "You can start serving private predictions with only three steps:\n", + "- **Step 1**: train your model with normal Keras.\n", + "- **Step 2**: secure and serve your machine learning model (server).\n", + "- **Step 3**: query the secured model to receive private predictions (client). \n", + "\n", + "Alright, let's go through these three steps so you can deploy impactful machine learning services without sacrificing user privacy or model security.\n", + "\n", + "Huge shoutout to the Dropout Labs ([@dropoutlabs](https://twitter.com/dropoutlabs)) and TF Encrypted ([@tf_encrypted](https://twitter.com/tf_encrypted)) teams for their great work which makes this demo possible, especially: Jason Mancuso ([@jvmancuso](https://twitter.com/jvmancuso)), Yann Dupis ([@YannDupis](https://twitter.com/YannDupis)), and Morten Dahl ([@mortendahlcs](https://github.com/mortendahlcs)). \n", + "\n", + "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tycvkjwj_FEY", + "colab_type": "text" + }, + "source": [ + "## Train Your Model in Keras\n", + "\n", + "To use privacy-preserving machine learning techniques for your projects you should not have to learn a new machine learning framework. If you have basic [Keras](https://keras.io/) knowledge, you can start using these techniques with Syft Keras. If you have never used Keras before, you can learn a bit more about it through the [Keras documentation](https://keras.io). \n", + "\n", + "Before serving private predictions, the first step is to train your model with normal Keras. As an example, we will train a model to classify handwritten digits. 
To train this model we will use the canonical [MNIST dataset](http://yann.lecun.com/exdb/mnist/).\n", + "\n", + "We borrow [this example](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) from the reference Keras repository. To train your classification model, you just run the cell below." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "R3_iOScU_FEa", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from __future__ import print_function\n", + "import tensorflow.keras as keras\n", + "from tensorflow.keras.datasets import mnist\n", + "from tensorflow.keras.models import Sequential\n", + "from tensorflow.keras.layers import Dense, Dropout, Flatten\n", + "from tensorflow.keras.layers import Conv2D, AveragePooling2D\n", + "from tensorflow.keras.layers import Activation\n", + "\n", + "batch_size = 128\n", + "num_classes = 10\n", + "epochs = 2\n", + "\n", + "# input image dimensions\n", + "img_rows, img_cols = 28, 28\n", + "\n", + "# the data, split between train and test sets\n", + "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", + "\n", + "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", + "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", + "input_shape = (img_rows, img_cols, 1)\n", + "\n", + "x_train = x_train.astype('float32')\n", + "x_test = x_test.astype('float32')\n", + "x_train /= 255\n", + "x_test /= 255\n", + "print('x_train shape:', x_train.shape)\n", + "print(x_train.shape[0], 'train samples')\n", + "print(x_test.shape[0], 'test samples')\n", + "\n", + "# convert class vectors to binary class matrices\n", + "y_train = keras.utils.to_categorical(y_train, num_classes)\n", + "y_test = keras.utils.to_categorical(y_test, num_classes)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + 
"model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Flatten())\n", + "model.add(Dense(num_classes, activation='softmax'))\n", + "\n", + "model.compile(loss=keras.losses.categorical_crossentropy,\n", + " optimizer=keras.optimizers.Adadelta(),\n", + " metrics=['accuracy'])\n", + "\n", + "model.fit(x_train, y_train,\n", + " batch_size=batch_size,\n", + " epochs=epochs,\n", + " verbose=1,\n", + " validation_data=(x_test, y_test))\n", + "score = model.evaluate(x_test, y_test, verbose=0)\n", + "print('Test loss:', score[0])\n", + "print('Test accuracy:', score[1])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9EBv5hry_FEe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "## Save your model's weights for future private prediction\n", + "model.save('short-conv-mnist.h5')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kGF6Avbq_FEi", + "colab_type": "text" + }, + "source": [ + "## Step 2: Load and Serve the Model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YBh3CZP4_FEj", + "colab_type": "text" + }, + "source": [ + "Now that you have a trained model with normal Keras, you are ready to serve some private predictions. We can do that using Syft Keras.\n", + "\n", + "To secure and serve this model, we will need three TFEWorkers (servers). This is because TF Encrypted under the hood uses an encryption technique called [multi-party computation (MPC)](https://en.wikipedia.org/wiki/Secure_multi-party_computation). The idea is to split the model weights and input data into shares, then send a share of each value to the different servers. 
The key property is that if you look at the share on one server, it reveals nothing about the original value (input data or model weights).\n", + "\n", + "We'll define a Syft Keras model like we did in the previous notebook. However, there is a trick: before instantiating this model, we'll run `hook = sy.KerasHook(tf.keras)`. This will add three important new methods to the Keras Sequential class:\n", + " - `share`: will secure your model via secret sharing; by default, it will use the SecureNN protocol from TF Encrypted to secret share your model between each of the three TFEWorkers. Most importantly, this will add the capability of providing predictions on encrypted data.\n", + " - `serve`: this function will launch a serving queue, so that the TFEWorkers can accept prediction requests on the secured model from external clients.\n", + " - `shutdown_workers`: once you are done providing private predictions, you can shut down your model by running this function. It will direct you to shut down the server processes manually if you've opted to manually manage each worker.\n", + "\n", + "If you want to learn more about MPC, you can read this excellent [blog post](https://mortendahl.github.io/2017/04/17/private-deep-learning-with-mpc/)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "h90qiZQb_FEk", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "from tensorflow.keras import Sequential\n", + "from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Activation, Flatten\n", + "\n", + "import syft as sy\n", + "hook = sy.KerasHook(tf.keras)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a4ljBEz3_FEp", + "colab_type": "text" + }, + "source": [ + "## Model\n", + "\n", + "As you can see, we define almost the exact same model as before, except we provide a `batch_input_shape`. 
This allows TF Encrypted to better optimize the secure computations via predefined tensor shapes. For this MNIST demo, we'll send input data with the shape of (1, 28, 28, 1). \n", + "We also return the logit instead of softmax because this operation is complex to perform using MPC, and we don't need it to serve prediction requests." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ToesGyS1_FEr", + "colab_type": "code", + "colab": {} + }, + "source": [ + "num_classes = 10\n", + "input_shape = (1, 28, 28, 1)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Flatten())\n", + "model.add(Dense(num_classes, name=\"logit\"))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zN7wM6tK_FEt", + "colab_type": "text" + }, + "source": [ + "### Load Pre-trained Weights\n", + "\n", + "With `load_weights` you can easily load the weights you have saved previously after training your model." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "fWS_ivtW_FEu", + "colab_type": "code", + "colab": {} + }, + "source": [ + "pre_trained_weights = 'short-conv-mnist.h5'\n", + "model.load_weights(pre_trained_weights)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "r1viq4MF_FEy", + "colab_type": "text" + }, + "source": [ + "## Step 3: Setup Your Worker Connectors\n", + "\n", + "Let's now connect to the TFEWorkers (`alice`, `bob`, and `carol`) required by TF Encrypted to perform private predictions. For each TFEWorker, you just have to specify a host. 
We then combine these workers in a cluster.\n", + "\n", + "These workers run a [TensorFlow server](https://www.tensorflow.org/api_docs/python/tf/distribute/Server), which you can either manage manually (`AUTO = False`) or ask the workers to manage for you (`AUTO = True`). If you choose to manage them manually, you will be instructed to execute a terminal command on each worker's host device after calling `model.share()` below. If all workers are hosted on a single device (e.g. `localhost`), you can choose to have Syft automatically manage each worker's TensorFlow server." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "1FqMuUif_FE0", + "colab_type": "code", + "colab": {} + }, + "source": [ + "AUTO = False\n", + "\n", + "alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)\n", + "bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)\n", + "carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)\n", + "\n", + "cluster = sy.TFECluster(alice, bob, carol)\n", + "cluster.start()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iLSl14IV_FE5", + "colab_type": "text" + }, + "source": [ + "## Step 4: Launch 3 Servers\n", + "\n", + "If you have chosen to manually control the workers (i.e. `AUTO = False`) then you now need to launch 3 servers. Look for the exact commands to run in the info messages printed above. You are looking for the following, where the `...` are actual file paths:\n", + "\n", + "- `python -m tf_encrypted.player --config ... server0`\n", + "- `python -m tf_encrypted.player --config ... server1`\n", + "- `python -m tf_encrypted.player --config ... 
server2`" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xVjW6pV1_FFA", + "colab_type": "text" + }, + "source": [ + "## Step 5: Split the Model Into Shares\n", + "\n", + "Thanks to `sy.KerasHook(tf.keras)` you can call the `share` method to transform your model into a TF Encrypted Keras model.\n", + "\n", + "If you have asked to manually manage servers above then this step will not complete until they have all been launched. Note that your firewall may ask Python to accept incoming connections." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "uc4x8fcr_FFA", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.share(cluster)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZgFjSLeB_FFD", + "colab_type": "text" + }, + "source": [ + "## Step 6: Serve the Model\n", + "\n", + "Perfect! Now by calling `model.serve`, your model is ready to provide some private predictions. You can set `num_requests` to limit the number of prediction requests served by the model; if it is not specified, the model will be served until interrupted." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Dub2hMwF_FFE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.serve(num_requests=3)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Vm6K-aly_FFI", + "colab_type": "text" + }, + "source": [ + "## Step 7: Run the Client\n", + "\n", + "At this point, open up and run the companion notebook: Section 4b - Encrypted Keras Client" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ltTwdjlh_FFJ", + "colab_type": "text" + }, + "source": [ + "## Step 8: Shutdown the Servers\n", + "\n", + "Once the request limit above has been reached, the model will no longer be available for serving requests, but it is still secret shared between the three workers above. 
You can kill the workers by executing the cell below.\n", + "\n", + "**Congratulations** on finishing Part 12: Secure Classification with Syft Keras and TFE!" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "G50Ft-i9_FFK", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.stop()\n", + "cluster.stop()\n", + "\n", + "if not AUTO:\n", + " process_ids = !ps aux | grep '[p]ython -m tf_encrypted.player --config' | awk '{print $2}'\n", + " for process_id in process_ids:\n", + " !kill {process_id}\n", + " print(\"Process ID {id} has been killed.\".format(id=process_id))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "lKdWS71p_FFL", + "colab_type": "code", + "colab": {} + }, + "source": [ + "" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PJjFCDGV_FFO", + "colab_type": "text" + }, + "source": [ + "# Keystone Project - Mix and Match What You've Learned\n", + "\n", + "Description: Take two of the concepts you've learned about in this course (Encrypted Computation, Federated Learning, Differential Privacy) and combine them for a use case of your own design. Extra credit if you can get your demo working with [WebSocketWorkers](https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials/advanced/websockets-example-MNIST) instead of VirtualWorkers! 
Then take your demo or example application, write a blogpost, and share that blogpost in #general-discussion on OpenMined's slack!!!\n", + "\n", + "Inspiration:\n", + "- This Course's Code: https://github.com/Udacity/private-ai\n", + "- OpenMined's Tutorials: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials\n", + "- OpenMined's Blog: https://blog.openmined.org" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "QNtWdtzx_FFO", + "colab_type": "code", + "colab": {} + }, + "source": [ + "import random\n", + "import numpy as np\n", + "import string\n", + "import syft as sy\n", + "import torch as th\n", + "hook = sy.TorchHook(th)\n", + "from torch import nn, optim\n", + "import torch.nn.functional as F\n", + "\n", + "BASE = 10\n", + "\n", + "PRECISION_INTEGRAL = 8\n", + "PRECISION_FRACTIONAL = 8\n", + "Q = 293973345475167247070445277780365744413\n", + "\n", + "PRECISION = PRECISION_INTEGRAL + PRECISION_FRACTIONAL\n", + "\n", + "assert(Q > BASE**PRECISION)\n", + "\n", + "def encode(rational):\n", + " upscaled = int(rational * BASE**PRECISION_FRACTIONAL)\n", + " field_element = upscaled % Q\n", + " return field_element\n", + "\n", + "def decode(field_element):\n", + " upscaled = field_element if field_element <= Q/2 else field_element - Q\n", + " rational = upscaled / BASE**PRECISION_FRACTIONAL\n", + " return rational\n", + "\n", + "def encrypt(secret):\n", + " first = random.randrange(Q)\n", + " second = random.randrange(Q)\n", + " third = (secret - first - second) % Q\n", + " return [first, second, third]\n", + "\n", + "def decrypt(sharing):\n", + " return sum(sharing) % Q\n", + "\n", + "def add(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] + b[i]) % Q)\n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "kS5LGC6i_FFS", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = encrypt(encode(5.5))\n", + "x" + ], + "execution_count": 
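0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ksRoundTrip0", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# the three shares recombine to the original value\n", + "decode(decrypt(x))  # 5.5" + ], + "execution_count": 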
0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "5OMWLWzI_FFV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = encrypt(encode(2.3))\n", + "y" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "j3v6cKFz_FFc", + "colab_type": "code", + "colab": {} + }, + "source": [ + "z = add(x,y)\n", + "z" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "yDQz5dpc_FFe", + "colab_type": "code", + "colab": {} + }, + "source": [ + "decode(decrypt(z))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9YMFLeLw_FFf", + "colab_type": "code", + "colab": {} + }, + "source": [ + "field = 23740629843760239486723" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "RAGJU2qo_FFi", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob_x_share = 2372385723\n", + "alices_x_share = field - bob_x_share + x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "pKPoPpAe_FFl", + "colab_type": "code", + "colab": {} + }, + "source": [ + "(bob_x_share + alices_x_share) % field" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "5ByOlEFi_FFq", + "colab_type": "code", + "colab": {} + }, + "source": [ + "((bob_x_share + alice_x_share) - (bob_y_share + alice_y_share)) % field" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UvDWBYm4_FFx", + "colab_type": "code", + "colab": {} + }, + "source": [ + "((bob_x_share - bob_y_share) + (alice_x_share - alice_y_share)) % field" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "r2HirmgL_FF0", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob_x_share + alice_x_share + bob_y_share + alice_y_share" + 
], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "qtrMnRl-_FF3", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob_z_share = (bob_x_share - bob_y_share)\n", + "alice_z_share = (alice_x_share - alice_y_share)\n", + "(bob_z_share + alice_z_share) % field" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "2yI1uroD_FF7", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def sub(a, b):\n", + " c = list()\n", + " for i in range(len(a)):\n", + " c.append((a[i] - b[i]) % Q)\n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "zUw41eKh_FF-", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def imul(a, scalar):\n", + " \n", + " # logic here which can multiply by a public scalar\n", + " \n", + " c = list()\n", + " \n", + " for i in range(len(a)):\n", + " c.append((a[i] * scalar) % Q)\n", + " \n", + " return tuple(c)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "caIAcaLD_FGC", + "colab_type": "code", + "colab": {} + }, + "source": [ + "bob = sy.VirtualWorker(hook, id=\"bob\").add_worker(sy.local_worker)\n", + "alice = sy.VirtualWorker(hook, id=\"alice\").add_worker(sy.local_worker)\n", + "secure_worker = sy.VirtualWorker(hook, id=\"secure_worker\").add_worker(sy.local_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VUuFoQXwtk0V", + "colab_type": "text" + }, + "source": [ + "" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "w0tcYI22_FGJ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ikZk1LUe_FGQ", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = 
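The `sub` and `imul` cells above operate element-wise on share tuples modulo `Q`. A runnable check of both, with hypothetical `share`/`reconstruct` helpers standing in for the notebook's earlier cells and `Q` assumed to be the working modulus:

```python
import random

Q = 23740629843760239486723  # assumed modulus, matching the field used earlier

def share(x, n=2):
    s = [random.randrange(Q) for _ in range(n - 1)]
    s.append((x - sum(s)) % Q)
    return tuple(s)

def reconstruct(shares):
    return sum(shares) % Q

def sub(a, b):
    # element-wise subtraction of two share tuples
    return tuple((a[i] - b[i]) % Q for i in range(len(a)))

def imul(a, scalar):
    # multiply every share by a *public* scalar; no communication needed
    return tuple((a[i] * scalar) % Q for i in range(len(a)))

x, y = share(10), share(3)
assert reconstruct(sub(x, y)) == 7
assert reconstruct(imul(x, 4)) == 40
```

Note that negative results wrap around: subtracting a larger value yields `Q - k`, which is how the protocol encodes `-k`.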
x.share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "rLuYGDvl_FGS", + "colab_type": "code", + "colab": {} + }, + "source": [ + "y = y.share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "u5RELoOf_FGV", + "colab_type": "code", + "colab": {} + }, + "source": [ + "x = th.tensor([1,2,3,4])\n", + "y = th.tensor([2,-1,1,0])\n", + "\n", + "x = x.fix_precision().share(bob, alice, crypto_provider=secure_worker)\n", + "y = y.fix_precision().share(bob, alice, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "PdU3jjHytyPU", + "colab_type": "code", + "colab": {} + }, + "source": [ + "char2index = {}\n", + "index2char = {}" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "e0nytsoAtyCj", + "colab_type": "code", + "colab": {} + }, + "source": [ + "for i,char in enumerate(' ' + string.ascii_lowercase + '0123456789' + string.punctuation):\n", + " char2index[char] = i\n", + " index2char[i] = char" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "K32ttgDitx1x", + "colab_type": "code", + "colab": {} + }, + "source": [ + "str_input = \"Hello\"\n", + "max_len = 8" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ng0m0RK8txgR", + "colab_type": "code", + "colab": {} + }, + "source": [ + "def string2values(str_input, max_len=8):\n", + "\n", + " str_input = str_input[:max_len].lower()\n", + "\n", + " # pad strings shorter than max len\n", + " if(len(str_input) < max_len):\n", + " str_input = str_input + \".\" * (max_len - len(str_input))\n", + "\n", + " values = list()\n", + " for char in str_input:\n", + " values.append(char2index[char])\n", + " 
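`fix_precision()` in the cells above converts floats to integers so they can be secret-shared over an integer field. The idea can be sketched with a toy fixed-point codec (the base and precision here are illustrative assumptions, not PySyft's exact defaults):

```python
BASE, PRECISION = 10, 3  # assumed values for illustration only

def encode(x):
    # float -> fixed-point integer (scale up, then round)
    return int(round(x * BASE ** PRECISION))

def decode(n):
    # fixed-point integer -> float (scale back down)
    return n / BASE ** PRECISION

assert encode(2.3) == 2300
assert decode(encode(2.3)) == 2.3
```

Once encoded this way, every value is an integer, so the additive-sharing arithmetic from the earlier cells applies directly.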
return th.tensor(values).long()\n", + "\n", + "def values2string(input_values):\n", + " s = \"\"\n", + " for value in input_values:\n", + " s += index2char[int(value)]\n", + " return s\n", + "\n", + "def strings_equal(str_a, str_b):\n", + "\n", + " vect = (str_a * str_b).sum(1)\n", + "\n", + " x = vect[0]\n", + "\n", + " for i in range(vect.shape[0] - 1):\n", + " x = x * vect[i + 1] \n", + "\n", + " return x\n", + "\n", + "def one_hot(index, length):\n", + " vect = th.zeros(length).long()\n", + " vect[index] = 1\n", + " return vect\n", + "\n", + "def string2one_hot_matrix(str_input, max_len=8):\n", + "\n", + " str_input = str_input[:max_len].lower()\n", + "\n", + " # pad strings shorter than max len\n", + " if(len(str_input) < max_len):\n", + " str_input = str_input + \".\" * (max_len - len(str_input))\n", + "\n", + " char_vectors = list()\n", + " for char in str_input:\n", + " char_v = one_hot(char2index[char], len(char2index)).unsqueeze(0)\n", + " char_vectors.append(char_v)\n", + " \n", + " return th.cat(char_vectors, dim=0)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "CCEij5E3uQMn", + "colab_type": "code", + "colab": {} + }, + "source": [ + "class EncryptedDB():\n", + " \n", + " def __init__(self, *owners, max_key_len=8, max_val_len=8):\n", + " self.max_key_len = max_key_len\n", + " self.max_val_len = max_val_len\n", + " \n", + " self.keys = list()\n", + " self.values = list()\n", + " self.owners = owners\n", + " \n", + " def add_entry(self, key, value):\n", + " key = string2one_hot_matrix(key)\n", + " key = key.share(*self.owners)\n", + " self.keys.append(key)\n", + " \n", + " value = string2values(value, max_len=self.max_val_len)\n", + " value = value.share(*self.owners)\n", + " self.values.append(value)\n", + " \n", + " def query(self, query_str):\n", + " query_matrix = string2one_hot_matrix(query_str)\n", + " \n", + " query_matrix = query_matrix.share(*self.owners)\n", + "\n", + " key_matches = 
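The helpers above encode each string as a one-hot matrix and test equality with `strings_equal`: the dot product at each position is 1 iff that character matches, and multiplying across positions yields 1 iff every character matches. A plain-Python analogue (lists instead of the notebook's torch tensors) to make the logic concrete:

```python
import string

# character vocabulary, built the same way as in the cells above
char2index = {c: i for i, c in
              enumerate(' ' + string.ascii_lowercase + '0123456789' + string.punctuation)}

def one_hot(index, length):
    v = [0] * length
    v[index] = 1
    return v

def string2one_hot_matrix(s, max_len=8):
    s = s[:max_len].lower()
    s = s + "." * (max_len - len(s))  # pad short strings, as above
    return [one_hot(char2index[c], len(char2index)) for c in s]

def strings_equal(a, b):
    # per-position dot product: 1 iff that character matches
    per_pos = [sum(u * v for u, v in zip(ra, rb)) for ra, rb in zip(a, b)]
    # product over positions: 1 iff ALL characters match
    result = 1
    for p in per_pos:
        result *= p
    return result

assert strings_equal(string2one_hot_matrix("bob"), string2one_hot_matrix("bob")) == 1
assert strings_equal(string2one_hot_matrix("bob"), string2one_hot_matrix("bill")) == 0
```

Because it uses only additions and multiplications, this comparison can be evaluated directly on secret-shared one-hot matrices.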
list()\n", + " for key in self.keys:\n", + "\n", + " key_match = strings_equal(key, query_matrix)\n", + " key_matches.append(key_match)\n", + "\n", + " result = self.values[0] * key_matches[0]\n", + "\n", + " for i in range(len(self.values) - 1):\n", + " result += self.values[i+1] * key_matches[i+1]\n", + " \n", + " result = result.get()\n", + "\n", + " return values2string(result).replace(\".\",\"\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "B0MYPG-guQJE", + "colab_type": "code", + "colab": {} + }, + "source": [ + "db = EncryptedDB(bob, alice, secure_worker, max_val_len=256)\n", + "\n", + "db.add_entry(\"Bob\",\"(123) 456 7890\")\n", + "db.add_entry(\"Bill\", \"(234) 567 8901\")\n", + "db.add_entry(\"Sam\",\"(345) 678 9012\")\n", + "db.add_entry(\"Key\",\"really big json value\")\n", + "\n", + "db.query(\"Bob\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "2MFHBcYcuQF2", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# A Toy Dataset\n", + "data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)\n", + "target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)\n", + "\n", + "class Net(nn.Module):\n", + " def __init__(self):\n", + " super(Net, self).__init__()\n", + " self.fc1 = nn.Linear(2, 20)\n", + " self.fc2 = nn.Linear(20, 1)\n", + "\n", + " def forward(self, x):\n", + " x = self.fc1(x)\n", + " x = F.relu(x)\n", + " x = self.fc2(x)\n", + " return x\n", + "\n", + "# A Toy Model\n", + "model = Net()\n", + "\n", + "def train():\n", + " # Training Logic\n", + " opt = optim.SGD(params=model.parameters(),lr=0.1)\n", + " for iter in range(20):\n", + "\n", + " # 1) erase previous gradients (if they exist)\n", + " opt.zero_grad()\n", + "\n", + " # 2) make a prediction\n", + " pred = model(data)\n", + "\n", + " # 3) calculate how much we missed\n", + " loss = ((pred - target)**2).sum()\n", + "\n", + " # 4) figure out which weights 
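The `query` method of `EncryptedDB` above returns the sum of `value_i * match_i` over all rows; since at most one match flag is 1, the sum equals the matching value (and 0 when no key matches). A plaintext analogue of that selection trick, with hypothetical sample rows:

```python
# plaintext analogue of EncryptedDB.query: result = sum(value_i * match_i)
keys = ["bob", "bill", "sam"]
values = [111, 222, 333]

def query(q):
    # one-hot vector over rows: 1 where the key matches, 0 elsewhere
    matches = [1 if k == q else 0 for k in keys]
    return sum(v * m for v, m in zip(values, matches))

assert query("bill") == 222
assert query("nobody") == 0  # no match -> every term is zero
```

The encrypted version computes exactly this sum, except that both the match flags and the values stay secret-shared until the final `.get()`.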
caused us to miss\n", + " loss.backward()\n", + "\n", + " # 5) change those weights\n", + " opt.step()\n", + "\n", + " # 6) print our progress\n", + " print(loss.data)\n", + " \n", + "train()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "-H6sQT7auQDP", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model(data)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "5ePx1x9QuPze", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_model = model.fix_precision().share(alice, bob, crypto_provider=secure_worker)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "VRCC0TFwu1_K", + "colab_type": "code", + "colab": {} + }, + "source": [ + "list(encrypted_model.parameters())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Ovos_V-pu16z", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_data = data.fix_precision().share(alice, bob, crypto_provider=secure_worker)\n", + "encrypted_data" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "LsSHmRrpu14A", + "colab_type": "code", + "colab": {} + }, + "source": [ + "encrypted_prediction = encrypted_model(encrypted_data)\n", + "encrypted_prediction.get().float_precision()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "aacqfYZnu1xp", + "colab_type": "code", + "colab": {} + }, + "source": [ + "from __future__ import print_function\n", + "import tensorflow.keras as keras\n", + "from tensorflow.keras.datasets import mnist\n", + "from tensorflow.keras.models import Sequential\n", + "from tensorflow.keras.layers import Dense, Dropout, Flatten\n", + "from tensorflow.keras.layers import Conv2D, AveragePooling2D\n", + "from tensorflow.keras.layers import Activation\n", + "\n", + 
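The six-step loop in `train()` above can be reduced to its arithmetic core. A hand-rolled version for a one-parameter model `w * x` (no autograd; the gradient of `(w*x - target)**2` is written out explicitly, and the data point is a made-up example):

```python
# fit w so that w * 2.0 == 4.0, mirroring steps 2-5 of train() above
w, lr = 0.0, 0.1
x, target = 2.0, 4.0
for _ in range(50):
    pred = w * x                        # 2) make a prediction
    loss = (pred - target) ** 2         # 3) calculate how much we missed
    grad = 2 * (pred - target) * x      # 4) gradient of the loss w.r.t. w
    w -= lr * grad                      # 5) change the weight
assert abs(w - 2.0) < 1e-6
```

`loss.backward()` and `opt.step()` in the torch version perform exactly the gradient and update steps shown here, just for every parameter of the network at once.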
"batch_size = 128\n", + "num_classes = 10\n", + "epochs = 2\n", + "\n", + "# input image dimensions\n", + "img_rows, img_cols = 28, 28\n", + "\n", + "# the data, split between train and test sets\n", + "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", + "\n", + "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", + "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", + "input_shape = (img_rows, img_cols, 1)\n", + "\n", + "x_train = x_train.astype('float32')\n", + "x_test = x_test.astype('float32')\n", + "x_train /= 255\n", + "x_test /= 255\n", + "print('x_train shape:', x_train.shape)\n", + "print(x_train.shape[0], 'train samples')\n", + "print(x_test.shape[0], 'test samples')\n", + "\n", + "# convert class vectors to binary class matrices\n", + "y_train = keras.utils.to_categorical(y_train, num_classes)\n", + "y_test = keras.utils.to_categorical(y_test, num_classes)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Flatten())\n", + "model.add(Dense(num_classes, activation='softmax'))\n", + "\n", + "model.compile(loss=keras.losses.categorical_crossentropy,\n", + " optimizer=keras.optimizers.Adadelta(),\n", + " metrics=['accuracy'])\n", + "\n", + "model.fit(x_train, y_train,\n", + " batch_size=batch_size,\n", + " epochs=epochs,\n", + " verbose=1,\n", + " validation_data=(x_test, y_test))\n", + "score = model.evaluate(x_test, y_test, verbose=0)\n", + "print('Test loss:', score[0])\n", + "print('Test accuracy:', score[1])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "6ydb7iVDu_8Y", + 
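In the model above, each valid-padding 3x3 Conv2D shrinks the spatial size by 2 and each 2x2 AveragePooling2D floors a halving, so three conv/pool pairs take 28x28 down to 1x1 and Flatten feeds exactly 64 features (the last conv's filter count) to the final Dense layer. A back-of-envelope check of that shape arithmetic (not Keras output):

```python
# spatial size through the three conv(3x3, valid) + avg-pool(2x2) pairs above
size = 28
for _ in range(3):
    size = (size - 2) // 2  # conv shrinks by 2, pooling halves with floor
assert size == 1            # 28 -> 13 -> 5 -> 1; Flatten sees 1 * 1 * 64 features
```

This is worth checking by hand because the second (TFE-friendly) model below reuses the same stack with a fixed `batch_input_shape`.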
"colab_type": "code", + "colab": {} + }, + "source": [ + "num_classes = 10\n", + "input_shape = (1, 28, 28, 1)\n", + "\n", + "model = Sequential()\n", + "\n", + "model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(32, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Conv2D(64, (3, 3)))\n", + "model.add(AveragePooling2D((2, 2)))\n", + "model.add(Activation('relu'))\n", + "model.add(Flatten())\n", + "model.add(Dense(num_classes, name=\"logit\"))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "x0ExWe5Qu_4m", + "colab_type": "code", + "colab": {} + }, + "source": [ + "pre_trained_weights = 'short-conv-mnist.h5'\n", + "model.load_weights(pre_trained_weights)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "4zV7Y_CTu_yP", + "colab_type": "code", + "colab": {} + }, + "source": [ + "AUTO = False\n", + "\n", + "alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)\n", + "bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)\n", + "carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "PJ_K0CSJvdL9", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.share(alice, bob, carol)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "ps0XR5aMvty5", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.serve(num_requests=3)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "SlHixWX4vtqa", + "colab_type": "code", + "colab": {} + }, + "source": [ + "model.shutdown_workers()\n", + "\n", + "if not AUTO:\n", + " process_ids = !ps aux | grep '[p]ython -m 
tf_encrypted.player --config /tmp/tfe.config' | awk '{print $2}'\n", + " for process_id in process_ids:\n", + " !kill {process_id}\n", + " print(\"Process ID {id} has been killed.\".format(id=process_id))" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file From 14543973f994506f13e6d97b87748bd2a5d0e7fb Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 26 Aug 2019 22:03:58 +0530 Subject: [PATCH 13/14] Delete Section 4b - Encrypted Keras Client.ipynb --- Section 4b - Encrypted Keras Client.ipynb | 153 ---------------------- 1 file changed, 153 deletions(-) delete mode 100644 Section 4b - Encrypted Keras Client.ipynb diff --git a/Section 4b - Encrypted Keras Client.ipynb b/Section 4b - Encrypted Keras Client.ipynb deleted file mode 100644 index 12619b4..0000000 --- a/Section 4b - Encrypted Keras Client.ipynb +++ /dev/null @@ -1,153 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Step 2: Private Prediction using Syft Keras - Serving (Client)\n", - "\n", - "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_\n", - "\n", - "Congratulations! After training your model with normal Keras and securing it with Syft Keras, you are ready to request some private predictions. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import tensorflow as tf\n", - "from tensorflow.keras.datasets import mnist\n", - "\n", - "import syft as sy" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "# input image dimensions\n", - "img_rows, img_cols = 28, 28\n", - "\n", - "# the data, split between train and test sets\n", - "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", - "\n", - "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", - "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", - "input_shape = (img_rows, img_cols, 1)\n", - "\n", - "x_train = x_train.astype('float32')\n", - "x_test = x_test.astype('float32')\n", - "x_train /= 255\n", - "x_test /= 255" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "num_classes = 10\n", - "input_shape = (1, 28, 28, 1)\n", - "output_shape = (1, num_classes)" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "INFO:tf_encrypted:Starting session on target 'grpc://localhost:4000' using config graph_options {\n", - "}\n", - "\n" - ] - } - ], - "source": [ - "client = sy.TFEWorker()\n", - "\n", - "alice = sy.TFEWorker(host='localhost:4000')\n", - "bob = sy.TFEWorker(host='localhost:4001')\n", - "carol = sy.TFEWorker(host='localhost:4002')\n", - "cluster = sy.TFECluster(alice, bob, carol)\n", - "\n", - "client.connect_to_model(input_shape, output_shape, cluster)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "# User inputs\n", - "num_tests = 3\n", - "images, expected_labels = x_test[:num_tests], y_test[:num_tests]" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, 
- "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The image had label 7 and was correctly classified as 7\n", - "The image had label 2 and was correctly classified as 2\n", - "The image had label 1 and was correctly classified as 1\n" - ] - } - ], - "source": [ - "for image, expected_label in zip(images, expected_labels):\n", - "\n", - " res = client.query_model(image.reshape(1, 28, 28, 1))\n", - " predicted_label = np.argmax(res)\n", - "\n", - " print(\"The image had label {} and was {} classified as {}\".format(\n", - " expected_label,\n", - " \"correctly\" if expected_label == predicted_label else \"wrongly\",\n", - " predicted_label))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From e5d5f83813f30658ab1c8724a737d96307870fc3 Mon Sep 17 00:00:00 2001 From: Nimanshi Jha <40792134+nimanshijha@users.noreply.github.com> Date: Mon, 26 Aug 2019 22:21:13 +0530 Subject: [PATCH 14/14] Add files via upload --- Section_4b_Encrypted_Keras_Client (1).ipynb | 338 ++++++++++++++++++++ 1 file changed, 338 insertions(+) create mode 100644 Section_4b_Encrypted_Keras_Client (1).ipynb diff --git a/Section_4b_Encrypted_Keras_Client (1).ipynb b/Section_4b_Encrypted_Keras_Client (1).ipynb new file mode 100644 index 0000000..367a787 --- /dev/null +++ b/Section_4b_Encrypted_Keras_Client (1).ipynb @@ -0,0 +1,338 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + 
"language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.8" + }, + "colab": { + "name": "Section 4b - Encrypted Keras Client.ipynb", + "version": "0.3.2", + "provenance": [] + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "swud1891xAcK", + "colab_type": "text" + }, + "source": [ + "# Step 2: Private Prediction using Syft Keras - Serving (Client)\n", + "\n", + "_Demo Ref: https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials_\n", + "\n", + "Congratulations! After training your model with normal Keras and securing it with Syft Keras, you are ready to request some private predictions. " + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MJ5Js1JR0fbq", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "outputId": "9b0b5a57-208e-4af8-a77a-c6e39ec91b0e" + }, + "source": [ + "pip install syft" + ], + "execution_count": 3, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Collecting syft\n", + "Collecting msgpack>=0.6.1 (from syft)\n", + "Collecting flask-socketio>=3.3.2 (from syft)\n", + "Collecting websocket-client>=0.56.0 (from syft)\n", + "Collecting tf-encrypted!=0.5.7,>=0.5.4 (from syft)\n", + "Collecting websockets>=7.0 (from syft)\n", + "Collecting zstd>=1.4.0.0 (from syft)\n", + "Collecting lz4>=2.1.6 (from syft)\n", + "Collecting python-socketio>=4.3.0 (from flask-socketio>=3.3.2->syft)\n", + "Collecting pyyaml>=5.1 (from tf-encrypted!=0.5.7,>=0.5.4->syft)\n", + "Collecting python-engineio>=3.9.0 (from python-socketio>=4.3.0->flask-socketio>=3.3.2->syft)\n", + "Building wheels for collected packages: zstd, pyyaml\n", + "Successfully built zstd pyyaml\n", + "Installing collected packages: msgpack, python-engineio, python-socketio, flask-socketio, websocket-client, pyyaml, tf-encrypted, websockets, zstd, lz4, syft\n", + "Successfully installed flask-socketio-4.2.1 lz4-2.1.10 msgpack-0.6.1 python-engineio-3.9.3 python-socketio-4.3.1 pyyaml-5.1.2 syft-0.1.24a1 tf-encrypted-0.5.8 websocket-client-0.56.0 websockets-8.0.2 zstd-1.4.1.0\n" + ], + "name": "stdout" + }, + { + "output_type": "display_data", + "data": { + "application/vnd.colab-display-data+json": { + "pip_warning": { + "packages": [ + "yaml" + ] + } + } + }, + "metadata": { + "tags": [] + } + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZszwJEprxAcN", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 113 + }, + "outputId": "c82466c7-16f2-44ea-998e-c6cc0c782284" + }, + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "from tensorflow.keras.datasets import mnist\n", + "import syft as sy" + ], + "execution_count": 4, + "outputs": [ + { 
+ "output_type": "stream", + "text": [ + "WARNING: Logging before flag parsing goes to stderr.\n", + "W0826 16:46:49.061327 139877618374528 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'\n", + "W0826 16:46:49.078113 139877618374528 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.\n", + "\n" + ], + "name": "stderr" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "q3biE-zFxAcT", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 55 + }, + "outputId": "5e56c47c-fa89-43d8-eba9-bae85df9d4c7" + }, + "source": [ + "# input image dimensions\n", + "img_rows, img_cols = 28, 28\n", + "\n", + "# the data, split between train and test sets\n", + "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n", + "\n", + "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n", + "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n", + "input_shape = (img_rows, img_cols, 1)\n", + "\n", + "x_train = x_train.astype('float32')\n", + "x_test = x_test.astype('float32')\n", + "x_train /= 255\n", + "x_test /= 255" + ], + "execution_count": 5, + "outputs": [ + { + "output_type": "stream", + "text": [ + "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n", + "11493376/11490434 [==============================] - 0s 0us/step\n" + ], + "name": "stdout" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "n07-QQ7rxAcb", + "colab_type": "code", + "colab": {} + }, + "source": [ + "num_classes = 10\n", + "input_shape = (1, 28, 28, 1)\n", + "output_shape = (1, num_classes)" + ], + 
"execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "UzQt4mSlxAcf", + "colab_type": "code", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 208 + }, + "outputId": "49284754-432a-4bb1-802f-4eb247e2e50c" + }, + "source": [ + "client = sy.TFEWorker()\n", + "\n", + "alice = sy.TFEWorker(host='localhost:4000')\n", + "bob = sy.TFEWorker(host='localhost:4001')\n", + "carol = sy.TFEWorker(host='localhost:4002')\n", + "cluster = sy.TFECluster(alice, bob, carol)\n", + "\n", + "client.connect_to_model(input_shape, output_shape, cluster)" + ], + "execution_count": 7, + "outputs": [ + { + "output_type": "stream", + "text": [ + "W0826 16:48:02.842378 139877618374528 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/tensor/native.py:403: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.\n", + "\n", + "W0826 16:48:02.857806 139877618374528 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/config.py:300: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.\n", + "\n", + "W0826 16:48:02.860337 139877618374528 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/config.py:87: The name tf.GraphOptions is deprecated. 
Please use tf.compat.v1.GraphOptions instead.\n", + "\n", + "I0826 16:48:02.861839 139877618374528 session.py:55] Starting session on target 'grpc://localhost:4000' using config graph_options {\n", + "}\n", + "\n" + ], + "name": "stderr" + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "3I0A3LjexAck", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# client inputs: a few test images to classify\n", + "num_tests = 3\n", + "images, expected_labels = x_test[:num_tests], y_test[:num_tests]" + ], + "execution_count": 0, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "_MjiaB8LxAcp", + "colab_type": "code", + "colab": {} + }, + "source": [ + "# query the secured model on each image and compare against the true label\n", + "for image, expected_label in zip(images, expected_labels):\n", + "\n", + " res = client.query_model(image.reshape(1, 28, 28, 1))\n", + " predicted_label = np.argmax(res)\n", + "\n", + " print(\"The image had label {} and was {} classified as {}\".format(\n", + " expected_label,\n", + " \"correctly\" if expected_label == predicted_label else \"wrongly\",\n", + " predicted_label))" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file