This repository has been archived by the owner on Apr 16, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 8773e57
Showing
49 changed files
with
34,376 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,259 @@ | ||
# Created by https://www.toptal.com/developers/gitignore/api/python,visualstudiocode,linux,macos,windows | ||
# Edit at https://www.toptal.com/developers/gitignore?templates=python,visualstudiocode,linux,macos,windows | ||
|
||
### Linux ### | ||
*~ | ||
|
||
# temporary files which can be created if a process still has a handle open of a deleted file | ||
.fuse_hidden* | ||
|
||
# KDE directory preferences | ||
.directory | ||
|
||
# Linux trash folder which might appear on any partition or disk | ||
.Trash-* | ||
|
||
# .nfs files are created when an open file is removed but is still being accessed | ||
.nfs* | ||
|
||
### macOS ### | ||
# General | ||
.DS_Store | ||
.AppleDouble | ||
.LSOverride | ||
|
||
# Icon must end with two \r | ||
Icon | ||
|
||
|
||
# Thumbnails | ||
._* | ||
|
||
# Files that might appear in the root of a volume | ||
.DocumentRevisions-V100 | ||
.fseventsd | ||
.Spotlight-V100 | ||
.TemporaryItems | ||
.Trashes | ||
.VolumeIcon.icns | ||
.com.apple.timemachine.donotpresent | ||
|
||
# Directories potentially created on remote AFP share | ||
.AppleDB | ||
.AppleDesktop | ||
Network Trash Folder | ||
Temporary Items | ||
.apdisk | ||
|
||
### macOS Patch ### | ||
# iCloud generated files | ||
*.icloud | ||
|
||
### Python ### | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
cover/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
.pybuilder/ | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
# For a library or package, you might want to ignore these files since the code is | ||
# intended to run in multiple environments; otherwise, check them in: | ||
# .python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# poetry | ||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
# This is especially recommended for binary packages to ensure reproducibility, and is more | ||
# commonly ignored for libraries. | ||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
#poetry.lock | ||
|
||
# pdm | ||
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. | ||
#pdm.lock | ||
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it | ||
# in version control. | ||
# https://pdm.fming.dev/#use-with-ide | ||
.pdm.toml | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
# PyCharm | ||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
# and can be added to the global gitignore or merged into this file. For a more nuclear | ||
# option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
#.idea/ | ||
|
||
### VisualStudioCode ### | ||
.vscode/* | ||
!.vscode/settings.json | ||
!.vscode/tasks.json | ||
!.vscode/launch.json | ||
!.vscode/extensions.json | ||
!.vscode/*.code-snippets | ||
|
||
# Local History for Visual Studio Code | ||
.history/ | ||
|
||
# Built Visual Studio Code Extensions | ||
*.vsix | ||
|
||
### VisualStudioCode Patch ### | ||
# Ignore all local history of files | ||
.history | ||
.ionide | ||
|
||
### Windows ### | ||
# Windows thumbnail cache files | ||
Thumbs.db | ||
Thumbs.db:encryptable | ||
ehthumbs.db | ||
ehthumbs_vista.db | ||
|
||
# Dump file | ||
*.stackdump | ||
|
||
# Folder config file | ||
[Dd]esktop.ini | ||
|
||
# Recycle Bin used on file shares | ||
$RECYCLE.BIN/ | ||
|
||
# Windows Installer files | ||
*.cab | ||
*.msi | ||
*.msix | ||
*.msm | ||
*.msp | ||
|
||
# Windows shortcuts | ||
*.lnk | ||
|
||
# End of https://www.toptal.com/developers/gitignore/api/python,visualstudiocode,linux,macos,windows |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
{ | ||
"editor.rulers": [ | ||
120 | ||
] | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# This CITATION.cff file was generated with cffinit. | ||
# Visit https://bit.ly/cffinit to generate yours today! | ||
|
||
cff-version: 1.2.0 | ||
title: >- | ||
Effective and Efficient Ranking using a Dual Encoder | ||
Approach | ||
message: >- | ||
If you use this software, please cite it using the | ||
metadata from this file. | ||
type: thesis | ||
authors: | ||
- given-names: Tim | ||
family-names: Hagen | ||
repository-code: >- | ||
https://github.com/TheMrSheldon/Effective-and-Efficient-Ranking-using-a-Dual-Encoder-Approach | ||
abstract: >- | ||
Ever since BERT's inception, the Information Retrieval | ||
community has worked on harnessing BERT's potential for | ||
relevance ranking. To this end, the most prevalent | ||
approaches are distilling BERT into smaller transformer | ||
architectures or designing efficient ranking architectures | ||
around (distilled versions of) BERT. We propose replacing | ||
MMSE-ColBERT's document encoder by distilling it into a | ||
vastly smaller, graph-based architecture using a modified | ||
version of TinyBERT's loss objective. Our architecture | ||
creates an initial graph-of-word that is then refined | ||
using multiple heads of Graph Structure Learning. | ||
Empirically, we find that the smallest variant of our | ||
architecture works best. It consists of a single GAT-layer | ||
and three GCN-layers. The modified version of TinyBERT's | ||
loss objective is competitive with strong baselines, like | ||
MMSE-ColBERT, but does not beat them. By using Margin-MSE | ||
loss instead, we can further significantly improve | ||
effectiveness such that our model beats every baseline | ||
except the strongest, MMSE-ColBERT with query expansion. | ||
Due to its simplicity, our model is three times as fast as | ||
MMSE-ColBERT's document encoding. Our experiments show | ||
promise that our model can further be used for document | ||
encoding and to replace the query encoding of MMSE-ColBERT | ||
as well for more efficient ranking. | ||
keywords: | ||
- BERT | ||
- re-ranking | ||
- cross-architecture knowledge distillation | ||
- graph neural networks | ||
license: GPL-3.0 |
Oops, something went wrong.