I am currently running an experiment on a dataset using differential privacy concepts. I am trying to implement one of the mechanisms of differential privacy, the Laplace mechanism, in Python, using a sample dataset from the UCI Machine Learning Repository.
Let's assume we have a simple counting query that returns the number of people who earn '<=50K', grouped by their 'occupation':
SELECT
adult.occupation, COUNT(adult.salary_group) As NumofPeople
FROM
adult
WHERE
adult.salary_group = '<=50K'
GROUP BY
adult.occupation, adult.salary_group;
This is the Laplace function I am trying to use:
import numpy as np

def laplaceMechanism(x, epsilon):
    # Add Laplace noise with scale 1/epsilon (sensitivity 1 for a count)
    x += np.random.laplace(0, 1.0/epsilon, 1)[0]
    return x
So my question is: how can I apply this function to the data I get back from the query, taking epsilon = 2? I know that the Laplace mechanism works by adding random noise drawn from the Laplace distribution to the true answer of the query. A bit of insight would be appreciated.
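To make the question concrete, here is a minimal sketch of what I imagine applying the mechanism would look like: each per-occupation count gets its own independent Laplace noise. The counts in `true_counts` are made-up placeholders, not real results from the adult dataset, and `laplace_mechanism` with its `sensitivity` and `rng` parameters is a hypothetical variant of my function above. I am assuming a sensitivity of 1, on the basis that each person appears in exactly one row and therefore in exactly one occupation group, so adding or removing one person changes a single count by at most 1.

```python
import numpy as np

def laplace_mechanism(true_value, epsilon, sensitivity=1.0, rng=None):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    rng = np.random.default_rng() if rng is None else rng
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Placeholder counts, NOT real query results from the adult dataset.
true_counts = {"Sales": 100, "Tech-support": 40, "Exec-managerial": 75}

epsilon = 2
rng = np.random.default_rng(seed=0)  # seeded only to make the sketch repeatable

# Perturb every group's count independently.
noisy_counts = {
    occupation: laplace_mechanism(count, epsilon, rng=rng)
    for occupation, count in true_counts.items()
}

for occupation, noisy in noisy_counts.items():
    print(f"{occupation}: true={true_counts[occupation]}, noisy={noisy:.2f}")
```

With epsilon = 2 the noise scale is 0.5, so I would expect most noisy counts to land within a unit or two of the true count. The noisy values are real numbers and can in principle be negative for very small counts; whether to round or clip them afterwards is a separate choice I have not made here.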