First commit.

- Code made public under a GPL license.
pdroalves · Sep 6, 2016 · commit 1761dab

Showing 45 changed files with 3,986 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,11 @@
+*.o
+*.png
+*.out
+*.pyc
+*.dat
+__pycache__
+*.a
+*.so
+*.pyc
+*.swp
+*~
diff --git a/COPYING.md b/COPYING.md
diff --git a/README.md b/README.md
@@ -0,0 +1,36 @@
+# mongodb-secure
+
+[University of Campinas](http://www.unicamp.br), [Institute of Computing](http://www.ic.unicamp.br), Brazil.
+
+Laboratory of Security and Cryptography - [LASCA](http://www.lasca.ic.unicamp.br),<br>
+Laboratório Multidisciplinar de Computação de Alto Desempenho - [LMCAD](http://www.lmcad.ic.unicamp.br). <br>
+
+Author: [Pedro G. M. R. Alves](http://www.iampedro.com), PhD student @ IC-UNICAMP,<br/>
+Advisor: [Diego F. Aranha](http://www.ic.unicamp.br/~dfaranha). <br/>
+
+## About
+
+This is a proof-of-concept implementation of the framework proposed by [Alves and Aranha (2016)]. It offers a wrapper around MongoDB's Python driver that enables an application to store and query encrypted data in the database.
+
+## Citing
+Please cite using the template below:
+
+	@INPROCEEDINGS{Alves2016,
+ 		author = {{Alves, Pedro and Aranha, Diego}},
+  		title = {{A framework for searching encrypted databases}},
+  		year = 2016,
+  		BOOKTITLE= {{Anais do XVI Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais (SBSeg 2016)}},
+	}
+
+
+## Licensing
+
+mongodb-secure is released under a GPLv3 license.
+
+## Disclaimer
+
+This is a proof-of-concept implementation. We do not recommend using this code in production, and we do not claim that the cipher implementations provided here are correct or secure. Use at your own risk.
+
+
+**Privacy Warning:** This site tracks visitor information.
+
diff --git a/src/benchmarks/benchmark.sh b/src/benchmarks/benchmark.sh
@@ -0,0 +1,7 @@
+#!/bin/bash
+#
+sizes=(100 1000 10000 100000)
+for size in "${sizes[@]}";do
+	python generate_dataset.py "$size"
+	python load_dataset.py
+done
diff --git a/src/benchmarks/generate_dataset.py b/src/benchmarks/generate_dataset.py
@@ -0,0 +1,94 @@
+#!/usr/bin/python
+#coding: utf-8
+###########################################################################
+##########################################################################
+#
+# mongodb-secure
+# Copyright (C) 2016, Pedro Alves and Diego Aranha
+# {pedro.alves, dfaranha}@ic.unicamp.br
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+##########################################################################
+##########################################################################
+# This routine generates a random dataset whose elements represent entries in a
+# contact list.
+#
+# A parameter may be passed specifying the quantity of elements in the dataset.
+# Otherwise this number will be set to 10^4/2.
+#
+
+from random import randint,shuffle
+import loremipsum
+import json
+import sys
+
+outputfilename = "synthetic_dataset.json"
+
+# Quantity of entries
+
+if len(sys.argv) > 1:
+	N = int(sys.argv[1])
+else:
+	N = pow(10,4)//2
+
+ages = range(0,N)
+# seed
+male_firstnames = (['Tanner','Leland','Noble','Elroy','Irvin','Monty','Michael','Kristopher','Gonzalo','Elmo','Britt','Demarcus','Joey','Russ','Ismael','Dale','Santos','Melvin','Clifford','Cedrick','Jamel','Gustavo','Duncan','Wyatt','Otha','Kim','Lane','Trinidad','Kareem','Marcos','Edwardo','Cristopher','Jeramy','Willis','Ronnie','Maxwell','Carmelo','Rudy','Ulysses','Norbert','Isreal','Andreas','Odell','Heath','Palmer','Brenton','Whitney','Herman','Clinton','Alphonso'])
+female_firstnames = (['Colette','Kayleen','Suellen','Serena','Linsey','Aline','Carry','Pearlene','Merlyn','Cindy','Tanja','Elvie','Yung','Lahoma','Kayce','Taryn','Eufemia','Karyn','Chantelle','Indira','Katerine','Hue','Margeret','Ila','Beaulah','Selene','Ora','Krystal','Jeana','Devorah','Adele','Amberly','Susannah','Johna','Danica','Dulce','Kami','Janiece','Cleotilde','Venita','Shenika','Sharyn','Patrica','Gilma','Ivey','Fidela','Anamaria','Londa','Pearline','Elfriede'])
+surnames = (['Alves','Tallman','Wiechmann','Newcomb','Leatherwood','Teneyck','Dworkin','Bloomfield','Weinberger','Holtman','Li','Ruder','Asuncion','Holden','Haug','Bonfiglio','Hayles','Roseborough','Bensen','Maliszewski','Durgin','Mitten','Mcfall','Greenidge','Nowakowski','Au','Malcolm','Dail','Warthen','Schubert','Ballantine','Depaul','Roa','Figueredo','Dotson','Marrow','Cusic','Dalpiaz','Geddie','Graham','Mark','Hymel','Laroque','Chapple','Pasek','Rustin','Burg','Romo','Burnham','Runge','Hawkes','Lindgren','Aller','Carcamo','Winzer','Heishman','Mayne','Carmona','Villagomez','Biondi','Waterfield','Metzger','Kary','Bates','Benesh','Pleasant','Vong','Correira','Demoura','Gelinas','Alejos','Hulin','Sayegh','Hinkson','Wofford','Musselman','Mora','Dipiazza','Cliff','Barnhardt','Issa','Paula','Winkler','Lawhead','Murray','Scism','Cartagena','Oconner','Hermsen','Doten','Goldstein','Hites','Faivre','Hern','Grana','Lietz','Kawamura','Heard','Shaver','Tostado','Begum','Berthelot','Bakken','Bumgardner','Shroyer','Onstad','Martensen','Mcfall','Boling','Weil','Saur','Rubinstein','Visitacion','Concepcion','Claire','Ostlund','Augsburger','Gravley','Gao','Nixon','Espada','Malta','Dunkerson','Leija','Brimmer','Ozment','Opie','Olivarez','Raleigh','Marietta','Noss','Braz','Cribbs','Crooms','Merkley','Greenwood','Begay','Saban','Alcocer','Cerezo','Grasso','Kulpa','Mcneal','Heideman','Stong','Krogh','Giampaolo','Hullett','Belue','Bhatia','Brust'])
+countries = (['Afghanistan','Albania','Algeria','American Samoa','Andorra','Angola','Anguilla','Antarctica','Antigua And Barbuda','Argentina','Armenia','Aruba','Australia','Austria','Azerbaijan','Bahamas','Bahrain','Bangladesh','Barbados','Belarus','Belgium','Belize','Benin','Bermuda','Bhutan','Bolivia','Bosnia And Herzegovina','Botswana','Bouvet Island','Brazil','British Indian Ocean Territory','Brunei Darussalam','Bulgaria','Burkina Faso','Burundi','Cambodia','Cameroon','Canada','Cape Verde','Cayman Islands','Central African Republic','Chad','Chile','China','Christmas Island','Cocos (keeling) Islands','Colombia','Comoros','Congo','Congo, The Democratic Republic Of The','Cook Islands','Costa Rica','Cote Divoire','Croatia','Cuba','Cyprus','Czech Republic','Denmark','Djibouti','Dominica','Dominican Republic','East Timor','Ecuador','Egypt','El Salvador','Equatorial Guinea','Eritrea','Estonia','Ethiopia','Falkland Islands (malvinas)','Faroe Islands','Fiji','Finland','France','French Guiana','French Polynesia','French Southern Territories','Gabon','Gambia','Georgia','Germany','Ghana','Gibraltar','Greece','Greenland','Grenada','Guadeloupe','Guam','Guatemala','Guinea','Guinea-bissau','Guyana','Haiti','Heard Island And Mcdonald Islands','Holy See (vatican City State)','Honduras','Hong Kong','Hungary','Iceland','India','Indonesia','Iran, Islamic Republic Of','Iraq','Ireland','Israel','Italy','Jamaica','Japan','Jordan','Kazakstan','Kenya','Kiribati','Korea, Democratic Peoples Republic Of Korea, Republic Of','Kosovo','Kuwait','Kyrgyzstan','Lao Peoples Democratic Republic','Latvia','Lebanon','Lesotho','Liberia','Libyan Arab Jamahiriya','Liechtenstein','Lithuania','Luxembourg','Macau','Macedonia, The Former Yugoslav Republic Of','Madagascar','Malawi','Malaysia','Maldives','Mali','Malta','Marshall Islands','Martinique','Mauritania','Mauritius','Mayotte','Mexico','Micronesia, Federated States Of','Moldova, Republic Of','Monaco','Mongolia','Montserrat','Montenegro','Morocco','Mozambique','Myanmar','Namibia','Nauru','Nepal','Netherlands','Netherlands Antilles','New Caledonia','New Zealand','Nicaragua','Niger','Nigeria','Niue','Norfolk Island','Northern Mariana Islands','Norway','Oman','Pakistan','Palau','Palestinian Territory, Occupied','Panama','Papua New Guinea','Paraguay','Peru','Philippines','Pitcairn','Poland','Portugal','Puerto Rico','Qatar','Reunion','Romania','Russian Federation','Rwanda','Saint Helena','Saint Kitts And Nevis','Saint Lucia','Saint Pierre And Miquelon','Saint Vincent And The Grenadines','Samoa','San Marino','Sao Tome And Principe','Saudi Arabia','Senegal','Serbia','Seychelles','Sierra Leone','Singapore','Slovakia','Slovenia','Solomon Islands','Somalia','South Africa','South Georgia And The South Sandwich Islands','Spain','Sri Lanka','Sudan','Suriname','Svalbard And Jan Mayen','Swaziland','Sweden','Switzerland','Syrian Arab Republic','Taiwan, Province Of China','Tajikistan','Tanzania, United Republic Of','Thailand','Togo','Tokelau','Tonga','Trinidad And Tobago','Tunisia','Turkey','Turkmenistan','Turks And Caicos Islands','Tuvalu','Uganda','Ukraine','United Arab Emirates','United Kingdom','United States','United States Minor Outlying Islands','Uruguay','Uzbekistan','Vanuatu','Venezuela','Viet Nam','Virgin Islands, British','Virgin Islands, U.s.','Wallis And Futuna','Western Sahara','Yemen','Zambia','Zimbabwe'])
+
+# data model
+# this object is not really necessary
+model = { 'email':None,
+		   'firstname':None,
+		   'surname':None,
+		   'country':None,
+		   'age':None,
+		   'text':None
+		}
+
+
+# creates an empty dataset
+print "Generating a dataset with %d elements..." % N,
+shuffle(ages)
+dataset = []
+for i in xrange(N):
+	# clones
+	#print i
+
+	record = dict(model)
+
+	record["age"] = max((ages[i]+1) % 50,1) # ages fall in the range 1..49
+	record["country"] = countries[randint(0,len(countries)-1)]
+	record["surname"] = surnames[randint(0,len(surnames)-1)]
+	record["text"] = "".join(loremipsum.get_paragraphs(randint(1,2)))
+
+	gender = randint(0,1)
+	if gender == 1:
+		#male
+		record["firstname"] = male_firstnames[randint(0,len(male_firstnames)-1)]
+	else:
+		#female
+		record["firstname"] = female_firstnames[randint(0,len(female_firstnames)-1)]
+
+	record["email"] =  record["firstname"]+"."+record["surname"]+"@something.com"
+
+	dataset.append(record)
+
+with open(outputfilename,"w") as output:
+	json.dump(dataset,output)
+
+print "Done."
+print "Saved to %s" % outputfilename
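Note on the age rule in `generate_dataset.py`: the expression `max((ages[i]+1) % 50,1)` maps every record to an age between 1 and 49, and the residues 0 and 1 both end up as age 1. The sketch below reproduces only that rule so the resulting distribution can be inspected. It is written for Python 3 (the script itself targets Python 2), and the helper name `assign_ages` is hypothetical, not part of the repository.

```python
from collections import Counter
from random import shuffle


def assign_ages(n):
    """Reproduce the age rule from generate_dataset.py for n records."""
    # Python 3: range() must be turned into a list before shuffling
    ages = list(range(n))
    shuffle(ages)
    # (a + 1) % 50 lies in 0..49; max(..., 1) lifts the 0 case to 1
    return [max((a + 1) % 50, 1) for a in ages]


if __name__ == "__main__":
    counts = Counter(assign_ages(5000))
    print("distinct ages:", len(counts))
    print("min, max:", min(counts), max(counts))
    print("records with age 1:", counts[1])
    print("records with age 2:", counts[2])
```

When the dataset size is a multiple of 50, age 1 therefore occurs twice as often as any other value, which may matter when interpreting query timings on the `age` field.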
diff --git a/src/benchmarks/load_dataset.py b/src/benchmarks/load_dataset.py
@@ -0,0 +1,134 @@
+#!/usr/bin/python
+# coding: utf-8
+###########################################################################
+##########################################################################
+#
+# mongodb-secure
+# Copyright (C) 2016, Pedro Alves and Diego Aranha
+# {pedro.alves, dfaranha}@ic.unicamp.br
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+##########################################################################
+##########################################################################
+# This routine loads a dataset from a file named "synthetic_dataset.json", generated
+# by the script generate_dataset.py.
+# 
+# After that, a database named "benchmark" is created, along with two collections
+# named "encrypted" and "unencrypted". The dataset is inserted encrypted into the
+# former and unencrypted into the latter, and timing results are printed.
+# 
+
+import json
+from client import Client
+from secmongo import SecMongo
+from pymongo import MongoClient
+from time import time
+from index.indexnode import IndexNode
+from index.avltree import AVLTree
+import timeit
+
+#
+with open("synthetic_dataset.json") as datafile:
+	dataset = json.load(datafile)
+
+nMax = max(x["age"] for x in dataset)
+n = nMax+1
+print "Maximum integer supported by the ORE cryptosystem: %d" % n
+client = Client(Client.keygen(),n=n)
+
+client.set_attr("email","static")
+client.set_attr("firstname","static")
+client.set_attr("surname","static")
+client.set_attr("country","static")
+client.set_attr("age","index")
+client.set_attr("text","static")
+
+s = SecMongo(add_cipher_param=pow(client.ciphers["h_add"].keys["pub"]["n"],2))
+s.open_database("benchmark")
+s.set_collection("encrypted")
+s.drop_collection()
+
+print "%d items were loaded" % len(dataset)
+
+#
+client.encrypt(dataset[0])
+
+dataset.sort(key=lambda x: x["age"])
+
+def build_index(dataset):
+	root = AVLTree([dataset[0]["age"],0],nodeclass=IndexNode)
+	for i,doc in enumerate(dataset[1:]):
+		root = root.insert([doc["age"],i+1])
+
+	# assert it is correct
+	for data in dataset:
+		assert root.find(data["age"])
+	return root
+
+def load_encrypted_data():
+	# Build an index
+	index = build_index(dataset)
+	index.encrypt(client.ciphers["index"])
+	encrypted_dataset = []
+	for data in dataset:
+		encrypted_dataset.append(client.encrypt(data))
+
+	s.insert_indexed(index,encrypted_dataset)
+
+diff = timeit.timeit("load_encrypted_data()",setup="from __main__ import load_encrypted_data",number=1)
+print "Encrypted data loaded in %fs - %f elements/s" % (diff,len(dataset)/(diff))
+
+def encrypted_query():
+	return s.find(index=client.get_ctL(nMax))
+
+diff = timeit.timeit("encrypted_query()",setup="from __main__ import encrypted_query",number=100)
+print "Encrypted query in %fs" % (diff)
+
+def load_data():
+	count = 0
+	start = time()
+	for entry in dataset:
+		if (count % 1000) == 0:
+			# 
+			# print "%d - %f elements/s" % (count, 1000/(time()-start))
+			start = time()
+		count = count + 1
+		collection.insert(entry)
+
+unencrypted_client = MongoClient()
+db = unencrypted_client["benchmark"]
+collection = db["unencrypted"]
+collection.drop()
+
+diff = timeit.timeit("load_data()",setup="from __main__ import load_data",number=1)
+print "Unencrypted data loaded in %fs - %f elements/s" % (diff,len(dataset)/(diff))
+
+# Create index
+collection.create_index([("age", SecMongo.ASCENDING)])
+def query(predicate,projection):
+	result = []
+	for x in s.find(client.get_ibe_sk(predicate),projection=projection):
+		result.append(x)
+	return result
+
+def query_range(predicate,projection):
+	result = []
+	for x in s.find(sort=[("age",SecMongo.DESCENDING)],projection=projection):
+		result.append(x)
+	return result
+
+def unencrypted_query():
+	return collection.find_one({"age":nMax})
+diff = timeit.timeit("unencrypted_query()",setup="from __main__ import unencrypted_query",number=100)
+print "Unencrypted query in %fs" % (diff)
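Note on the timing calls in `load_dataset.py`: each measurement passes a statement string plus a `from __main__ import ...` setup string to `timeit.timeit`. `timeit.timeit` also accepts a zero-argument callable, which avoids the import string. The sketch below shows that form in Python 3. The workload `load_stub` and the helper `throughput` are hypothetical stand-ins and do not call SecMongo or pymongo.

```python
import timeit


def load_stub(dataset):
    """Hypothetical stand-in for load_encrypted_data(): copies each record."""
    return [dict(record) for record in dataset]


def throughput(func, dataset, repeats=1):
    """Return (seconds, elements per second) for `repeats` calls of func(dataset)."""
    elapsed = timeit.timeit(lambda: func(dataset), number=repeats)
    # Guard against a zero reading on very coarse timers
    if elapsed > 0:
        rate = len(dataset) * repeats / elapsed
    else:
        rate = float("inf")
    return elapsed, rate


if __name__ == "__main__":
    data = [{"age": i % 50} for i in range(10000)]
    seconds, rate = throughput(load_stub, data)
    print("Loaded in %fs - %f elements/s" % (seconds, rate))
```

Passing the dataset as an argument also removes the dependency on module-level globals, so the same helper could time the encrypted and unencrypted paths.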
diff --git a/src/intperm.py/.gitignore b/src/intperm.py/.gitignore
@@ -0,0 +1,36 @@
+*.py[cod]
+
+# C extensions
+*.so
+
+# Packages
+*.egg
+*.egg-info
+dist
+build
+eggs
+parts
+bin
+var
+sdist
+develop-eggs
+.installed.cfg
+lib
+lib64
+__pycache__
+
+# Installer logs
+pip-log.txt
+
+# Unit test / coverage reports
+.coverage
+.tox
+nosetests.xml
+
+# Translations
+*.mo
+
+# Mr Developer
+.mr.developer.cfg
+.project
+.pydevproject
diff --git a/src/intperm.py/.travis.yml b/src/intperm.py/.travis.yml
@@ -0,0 +1,17 @@
+language: python
+
+python:
+  - 2.6
+  - 2.7
+  - 3.2
+  - 3.3
+  - pypy
+
+install:
+  - pip install nose coverage coveralls .
+script:
+  - nosetests
+after_success:
+  - mkdir -p build/lib
+  - coverage run --source=permutation.py setup.py -q test
+  - coveralls
diff --git a/src/intperm.py/LICENSE b/src/intperm.py/LICENSE
@@ -0,0 +1,24 @@
+This is free and unencumbered software released into the public domain.
+
+Anyone is free to copy, modify, publish, use, compile, sell, or
+distribute this software, either in source code form or as a compiled
+binary, for any purpose, commercial or non-commercial, and by any
+means.
+
+In jurisdictions that recognize copyright laws, the author or authors
+of this software dedicate any and all copyright interest in the
+software to the public domain. We make this dedication for the benefit
+of the public at large and to the detriment of our heirs and
+successors. We intend this dedication to be an overt act of
+relinquishment in perpetuity of all present and future rights to this
+software under copyright law.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+OTHER DEALINGS IN THE SOFTWARE.
+
+For more information, please refer to <http://unlicense.org>