First commit.
- Code made public under a GPL license.
pdroalves committed Sep 6, 2016
0 parents commit 1761dab
Showing 45 changed files with 3,986 additions and 0 deletions.
11 changes: 11 additions & 0 deletions .gitignore
@@ -0,0 +1,11 @@
*.o
*.png
*.out
*.pyc
*.dat
__pycache__
*.a
*.so
*.pyc
*.swp
*~
621 changes: 621 additions & 0 deletions COPYING.md

Large diffs are not rendered by default.

36 changes: 36 additions & 0 deletions README.md
@@ -0,0 +1,36 @@
# mongodb-secure

[University of Campinas](http://www.unicamp.br), [Institute of Computing](http://www.ic.unicamp.br), Brazil.

Laboratory of Security and Cryptography - [LASCA](http://www.lasca.ic.unicamp.br),<br>
Laboratório Multidisciplinar de Computação de Alto Desempenho - [LMCAD](http://www.lmcad.ic.unicamp.br). <br>

Author: [Pedro G. M. R. Alves](http://www.iampedro.com), PhD student @ IC-UNICAMP,<br/>
Advisor: [Diego F. Aranha](http://www.ic.unicamp.br/~dfaranha). <br/>

## About

This is a proof-of-concept implementation of the framework proposed by [Alves and Aranha (2016)]. It provides a wrapper around MongoDB's Python driver that enables an application to store and query encrypted data in the database.

## Citing
Please cite using the template below:

@INPROCEEDINGS{Alves2016,
author = {{Alves, Pedro and Aranha, Diego}},
title = {{A framework for searching encrypted databases}},
year = 2016,
BOOKTITLE= {{Anais do XVI Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais (SBSeg 2016)}},
}


## Licensing

mongodb-secure is released under a GPLv3 license.

## Disclaimer

This is a proof-of-concept implementation. We do not recommend using this code in production, and we do not claim that the cipher implementations provided here are correct or secure. Use at your own risk.



7 changes: 7 additions & 0 deletions src/benchmarks/benchmark.sh
@@ -0,0 +1,7 @@
#!/bin/bash
#
sizes=(100 1000 10000 100000)
for size in "${sizes[@]}"; do
    python generate_dataset.py "$size"
    python load_dataset.py
done
94 changes: 94 additions & 0 deletions src/benchmarks/generate_dataset.py
@@ -0,0 +1,94 @@
#!/usr/bin/python
#coding: utf-8
###########################################################################
##########################################################################
#
# mongodb-secure
# Copyright (C) 2016, Pedro Alves and Diego Aranha
# {pedro.alves, dfaranha}@ic.unicamp.br

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##########################################################################
##########################################################################
# This routine generates a random dataset. This has elements that symbolize a
# contact list.
#
# A parameter may be passed specifying the quantity of elements in the dataset.
# Otherwise this number will be set to 10^4/2.
#

from random import randint,shuffle
import loremipsum
import json
import sys

outputfilename = "synthetic_dataset.json"

# Quantity of entries

if len(sys.argv) > 1:
    N = int(sys.argv[1])
else:
    N = pow(10,4)/2

ages = range(0,N)
# seed
male_firstnames = (['Tanner','Leland','Noble','Elroy','Irvin','Monty','Michael','Kristopher','Gonzalo','Elmo','Britt','Demarcus','Joey','Russ','Ismael','Dale','Santos','Melvin','Clifford','Cedrick','Jamel','Gustavo','Duncan','Wyatt','Otha','Kim','Lane','Trinidad','Kareem','Marcos','Edwardo','Cristopher','Jeramy','Willis','Ronnie','Maxwell','Carmelo','Rudy','Ulysses','Norbert','Isreal','Andreas','Odell','Heath','Palmer','Brenton','Whitney','Herman','Clinton','Alphonso'])
female_firstnames = (['Colette','Kayleen','Suellen','Serena','Linsey','Aline','Carry','Pearlene','Merlyn','Cindy','Tanja','Elvie','Yung','Lahoma','Kayce','Taryn','Eufemia','Karyn','Chantelle','Indira','Katerine','Hue','Margeret','Ila','Beaulah','Selene','Ora','Krystal','Jeana','Devorah','Adele','Amberly','Susannah','Johna','Danica','Dulce','Kami','Janiece','Cleotilde','Venita','Shenika','Sharyn','Patrica','Gilma','Ivey','Fidela','Anamaria','Londa','Pearline','Elfriede'])
surnames = (['Alves','Tallman','Wiechmann','Newcomb','Leatherwood','Teneyck','Dworkin','Bloomfield','Weinberger','Holtman','Li','Ruder','Asuncion','Holden','Haug','Bonfiglio','Hayles','Roseborough','Bensen','Maliszewski','Durgin','Mitten','Mcfall','Greenidge','Nowakowski','Au','Malcolm','Dail','Warthen','Schubert','Ballantine','Depaul','Roa','Figueredo','Dotson','Marrow','Cusic','Dalpiaz','Geddie','Graham','Mark','Hymel','Laroque','Chapple','Pasek','Rustin','Burg','Romo','Burnham','Runge','Hawkes','Lindgren','Aller','Carcamo','Winzer','Heishman','Mayne','Carmona','Villagomez','Biondi','Waterfield','Metzger','Kary','Bates','Benesh','Pleasant','Vong','Correira','Demoura','Gelinas','Alejos','Hulin','Sayegh','Hinkson','Wofford','Musselman','Mora','Dipiazza','Cliff','Barnhardt','Issa','Paula','Winkler','Lawhead','Murray','Scism','Cartagena','Oconner','Hermsen','Doten','Goldstein','Hites','Faivre','Hern','Grana','Lietz','Kawamura','Heard','Shaver','Tostado','Begum','Berthelot','Bakken','Bumgardner','Shroyer','Onstad','Martensen','Mcfall','Boling','Weil','Saur','Rubinstein','Visitacion','Concepcion','Claire','Ostlund','Augsburger','Gravley','Gao','Nixon','Espada','Malta','Dunkerson','Leija','Brimmer','Ozment','Opie','Olivarez','Raleigh','Marietta','Noss','Braz','Cribbs','Crooms','Merkley','Greenwood','Begay','Saban','Alcocer','Cerezo','Grasso','Kulpa','Mcneal','Heideman','Stong','Krogh','Giampaolo','Hullett','Belue','Bhatia','Brust'])
countries = (['Afghanistan','Albania','Algeria','American Samoa','Andorra','Angola','Anguilla','Antarctica','Antigua And Barbuda','Argentina','Armenia','Aruba','Australia','Austria','Azerbaijan','Bahamas','Bahrain','Bangladesh','Barbados','Belarus','Belgium','Belize','Benin','Bermuda','Bhutan','Bolivia','Bosnia And Herzegovina','Botswana','Bouvet Island','Brazil','British Indian Ocean Territory','Brunei Darussalam','Bulgaria','Burkina Faso','Burundi','Cambodia','Cameroon','Canada','Cape Verde','Cayman Islands','Central African Republic','Chad','Chile','China','Christmas Island','Cocos (keeling) Islands','Colombia','Comoros','Congo','Congo, The Democratic Republic Of The','Cook Islands','Costa Rica','Cote Divoire','Croatia','Cuba','Cyprus','Czech Republic','Denmark','Djibouti','Dominica','Dominican Republic','East Timor','Ecuador','Egypt','El Salvador','Equatorial Guinea','Eritrea','Estonia','Ethiopia','Falkland Islands (malvinas)','Faroe Islands','Fiji','Finland','France','French Guiana','French Polynesia','French Southern Territories','Gabon','Gambia','Georgia','Germany','Ghana','Gibraltar','Greece','Greenland','Grenada','Guadeloupe','Guam','Guatemala','Guinea','Guinea-bissau','Guyana','Haiti','Heard Island And Mcdonald Islands','Holy See (vatican City State)','Honduras','Hong Kong','Hungary','Iceland','India','Indonesia','Iran, Islamic Republic Of','Iraq','Ireland','Israel','Italy','Jamaica','Japan','Jordan','Kazakstan','Kenya','Kiribati','Korea, Democratic Peoples Republic Of Korea, Republic Of','Kosovo','Kuwait','Kyrgyzstan','Lao Peoples Democratic Republic','Latvia','Lebanon','Lesotho','Liberia','Libyan Arab Jamahiriya','Liechtenstein','Lithuania','Luxembourg','Macau','Macedonia, The Former Yugoslav Republic Of','Madagascar','Malawi','Malaysia','Maldives','Mali','Malta','Marshall Islands','Martinique','Mauritania','Mauritius','Mayotte','Mexico','Micronesia, Federated States Of','Moldova, Republic Of','Monaco','Mongolia','Montserrat','Montenegro','Morocco','Mozambique','Myanmar','Namibia','Nauru','Nepal','Netherlands','Netherlands Antilles','New Caledonia','New Zealand','Nicaragua','Niger','Nigeria','Niue','Norfolk Island','Northern Mariana Islands','Norway','Oman','Pakistan','Palau','Palestinian Territory, Occupied','Panama','Papua New Guinea','Paraguay','Peru','Philippines','Pitcairn','Poland','Portugal','Puerto Rico','Qatar','Reunion','Romania','Russian Federation','Rwanda','Saint Helena','Saint Kitts And Nevis','Saint Lucia','Saint Pierre And Miquelon','Saint Vincent And The Grenadines','Samoa','San Marino','Sao Tome And Principe','Saudi Arabia','Senegal','Serbia','Seychelles','Sierra Leone','Singapore','Slovakia','Slovenia','Solomon Islands','Somalia','South Africa','South Georgia And The South Sandwich Islands','Spain','Sri Lanka','Sudan','Suriname','Svalbard And Jan Mayen','Swaziland','Sweden','Switzerland','Syrian Arab Republic','Taiwan, Province Of China','Tajikistan','Tanzania, United Republic Of','Thailand','Togo','Tokelau','Tonga','Trinidad And Tobago','Tunisia','Turkey','Turkmenistan','Turks And Caicos Islands','Tuvalu','Uganda','Ukraine','United Arab Emirates','United Kingdom','United States','United States Minor Outlying Islands','Uruguay','Uzbekistan','Vanuatu','Venezuela','Viet Nam','Virgin Islands, British','Virgin Islands, U.s.','Wallis And Futuna','Western Sahara','Yemen','Zambia','Zimbabwe'])

# data model
# this object is not really necessary
model = { 'email':None,
          'firstname':None,
          'surname':None,
          'country':None,
          'age':None,
          'text':None
        }


# creates an empty dataset
print "Generating a dataset with %d elements..." % N,
shuffle(ages)
dataset = []
for i in xrange(N):
    # clone the model so each record is an independent dict
    record = dict(model)

    record["age"] = max((ages[i]+1) % 50,1) # always in the range 1..49
    record["country"] = countries[randint(0,len(countries)-1)]
    record["surname"] = surnames[randint(0,len(surnames)-1)]
    record["text"] = "".join(loremipsum.get_paragraphs(randint(1,2)))

    gender = randint(0,1)
    if gender == 1:
        # male
        record["firstname"] = male_firstnames[randint(0,len(male_firstnames)-1)]
    else:
        # female
        record["firstname"] = female_firstnames[randint(0,len(female_firstnames)-1)]

    record["email"] = record["firstname"]+"."+record["surname"]+"@something.com"

    dataset.append(record)

# the with block closes the file, so the JSON is flushed to disk
with open(outputfilename,"w") as output:
    json.dump(dataset,output)

print "Done."
print "Saved to %s" % outputfilename
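The record layout and the age formula used by generate_dataset.py can be sketched in a few self-contained lines. This is an illustration written for Python 3, not the script itself: the short name lists, the `make_record` and `make_dataset` helpers, and the fixed seed are assumptions, and the lorem-ipsum `text` field is left out so that only the standard library is needed.

```python
import random

# Tiny stand-ins for the name and country lists in the script (assumed values).
FIRSTNAMES = ["Tanner", "Colette", "Leland", "Serena"]
SURNAMES = ["Alves", "Tallman", "Newcomb"]
COUNTRIES = ["Brazil", "Canada", "Japan"]


def make_record(raw_age, rng):
    """Build one contact record (hypothetical helper, not part of the repo)."""
    firstname = rng.choice(FIRSTNAMES)
    surname = rng.choice(SURNAMES)
    return {
        "email": firstname + "." + surname + "@something.com",
        "firstname": firstname,
        "surname": surname,
        "country": rng.choice(COUNTRIES),
        # Same formula as the script: fold into 0..49, then lift 0 up to 1.
        "age": max((raw_age + 1) % 50, 1),
    }


def make_dataset(n, seed=0):
    """Generate n records from a shuffled list of raw ages 0..n-1."""
    rng = random.Random(seed)
    raw_ages = list(range(n))
    rng.shuffle(raw_ages)
    return [make_record(a, rng) for a in raw_ages]


dataset = make_dataset(200)
```

Because of the modulo, every age falls in the range 1..49 regardless of the dataset size, so the value `n = nMax+1` that load_dataset.py passes to the client never exceeds 50.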
134 changes: 134 additions & 0 deletions src/benchmarks/load_dataset.py
@@ -0,0 +1,134 @@
#!/usr/bin/python
# coding: utf-8
###########################################################################
##########################################################################
#
# mongodb-secure
# Copyright (C) 2016, Pedro Alves and Diego Aranha
# {pedro.alves, dfaranha}@ic.unicamp.br

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##########################################################################
##########################################################################
# This routine loads a dataset from a file named "synthetic_dataset.json" and generated
# by the script generate_dataset.py.
#
# After that, a database named "benchmark" is created as well as two collections
# named "encrypted" and "unencrypted". The dataset is inserted encrypted to the
# former and unencrypted in the latter and speed results are printed.
#

import json
from client import Client
from secmongo import SecMongo
from pymongo import MongoClient
from time import time
from index.indexnode import IndexNode
from index.avltree import AVLTree
import timeit

#
datafile = open("synthetic_dataset.json")
dataset = json.load(datafile)

nMax = max(set([x["age"] for x in dataset]))
n = nMax+1
print "Maximum integer supported by the ORE cryptosystem: %d" % n
client = Client(Client.keygen(),n=n)

client.set_attr("email","static")
client.set_attr("firstname","static")
client.set_attr("surname","static")
client.set_attr("country","static")
client.set_attr("age","index")
client.set_attr("text","static")

s = SecMongo(add_cipher_param=pow(client.ciphers["h_add"].keys["pub"]["n"],2))
s.open_database("benchmark")
s.set_collection("encrypted")
s.drop_collection()

print "%d items were loaded" % len(dataset)

#
client.encrypt(dataset[0])

dataset.sort(key=lambda x: x["age"])

def build_index(dataset):
    root = AVLTree([dataset[0]["age"],0],nodeclass=IndexNode)
    for i,doc in enumerate(dataset[1:]):
        root = root.insert([doc["age"],i+1])

    # assert it is correct
    for data in dataset:
        assert root.find(data["age"])
    return root

def load_encrypted_data():
    # Build an index
    index = build_index(dataset)
    index.encrypt(client.ciphers["index"])
    encrypted_dataset = []
    for data in dataset:
        encrypted_dataset.append(client.encrypt(data))

    s.insert_indexed(index,encrypted_dataset)

diff = timeit.timeit("load_encrypted_data()",setup="from __main__ import load_encrypted_data",number=1)
print "Encrypted data loaded in %fs - %f elements/s" % (diff,len(dataset)/(diff))

def encrypted_query():
    return s.find(index=client.get_ctL(nMax))

diff = timeit.timeit("encrypted_query()",setup="from __main__ import encrypted_query",number=100)
print "Encrypted query in %fs" % (diff)

def load_data():
    count = 0
    start = time()
    for entry in dataset:
        if (count % 1000) == 0:
            #
            # print "%d - %f elements/s" % (count, 1000/(time()-start))
            start = time()
        count = count + 1
        collection.insert(entry)

unencrypted_client = MongoClient()
db = unencrypted_client["benchmark"]
collection = db["unencrypted"]
collection.drop()

diff = timeit.timeit("load_data()",setup="from __main__ import load_data",number=1)
print "Unencrypted data loaded in %fs - %f elements/s" % (diff,len(dataset)/(diff))

# Create index
collection.create_index([("age", SecMongo.ASCENDING)])
def query(predicate,projection):
    result = []
    for x in s.find(client.get_ibe_sk(predicate),projection=projection):
        result.append(x)
    return result

def query_range(predicate,projection):
    result = []
    for x in s.find(sort=[("age",SecMongo.DESCENDING)],projection=projection):
        result.append(x)
    return result

def unencrypted_query():
    return collection.find_one({"age":nMax})
diff = timeit.timeit("unencrypted_query()",setup="from __main__ import unencrypted_query",number=100)
print "Unencrypted query in %fs" % (diff)
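The plaintext side of this benchmark (sort by age, index each document by its position, time 100 lookups) can be tried without MongoDB or the encryption classes. The sketch below is a stand-in under stated assumptions, written for Python 3: it replaces the AVLTree/IndexNode index with a sorted list searched through `bisect`, uses a made-up dataset, and passes a callable to `timeit.timeit` instead of a statement string. The `find` helper is hypothetical, and the sketch says nothing about the cost of the encrypted path.

```python
import bisect
import timeit

# Made-up documents; only the "age" field matters here.
dataset = [{"age": (7 * i) % 49 + 1, "id": i} for i in range(1000)]
dataset.sort(key=lambda doc: doc["age"])

# Index of (age, position) pairs, mirroring the [age, i] pairs fed to the tree.
index = [(doc["age"], pos) for pos, doc in enumerate(dataset)]


def find(age):
    """Return the first document with this age, or None if there is none."""
    i = bisect.bisect_left(index, (age, 0))
    if i < len(index) and index[i][0] == age:
        return dataset[index[i][1]]
    return None


n_max = max(doc["age"] for doc in dataset)

# timeit accepts a zero-argument callable, so no "from __main__ import" setup is needed.
elapsed = timeit.timeit(lambda: find(n_max), number=100)
print("Plaintext lookup x100 in %fs" % elapsed)
```

Passing a callable keeps the timed function and the timing call in the same scope, which avoids the `setup="from __main__ import ..."` strings used in the script.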
36 changes: 36 additions & 0 deletions src/intperm.py/.gitignore
@@ -0,0 +1,36 @@
*.py[cod]

# C extensions
*.so

# Packages
*.egg
*.egg-info
dist
build
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64
__pycache__

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
.tox
nosetests.xml

# Translations
*.mo

# Mr Developer
.mr.developer.cfg
.project
.pydevproject
17 changes: 17 additions & 0 deletions src/intperm.py/.travis.yml
@@ -0,0 +1,17 @@
language: python

python:
- 2.6
- 2.7
- 3.2
- 3.3
- pypy

install:
- pip install nose coverage coveralls .
script:
- nosetests
after_success:
- mkdir -p build/lib
- coverage run --source=permutation.py setup.py -q test
- coveralls
24 changes: 24 additions & 0 deletions src/intperm.py/LICENSE
@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.

In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to <http://unlicense.org>