Just some ideas from the CMU NLP course I'm taking, re-implemented in Haskell for my own understanding.
Don't expect this to be incredibly useful at the moment -- if there are useful pieces in this that I end up using, I'll probably end up moving them into separate libraries. This is just a collection of probably fairly naive, inefficient implementations of some of the core NLP algorithms and techniques in Haskell for my own educational purposes.
I'll also store some links/notes/ideas here for my future reference. The following is a bit of a personal "awesome" list of NLP techniques and resources, with a bias towards non-standard language implementations (Haskell, Julia, Rust, Scala, etc.) to explore, and ideas to explore for combining probabilistic / Bayesian / machine learning kinds of techniques with semantic / ontological / logic / type-theory / category-theory sorts of approaches -- as well as an emphasis on problems I'd like to eventually solve for bedelibry and related personal projects.
But yeah, this is all MIT licensed, so if you find something useful, do what you want with it.
Note that this currently requires `--allow-newer` in order to build, due to the dependency on `hmatrix-svdlibc`.
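For reference, one way to scope this (a sketch -- the exact stanza is an assumption about how the bounds are being relaxed) is a `cabal.project` entry that only lifts the upper bounds for `hmatrix-svdlibc`'s dependencies, rather than passing a global `--allow-newer` flag on every invocation:

```
-- cabal.project (sketch): relax only hmatrix-svdlibc's upper bounds
packages: .

allow-newer: hmatrix-svdlibc:*
```

A one-off `cabal build --allow-newer` works too, but pinning it in the project file keeps plain `cabal build` working.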
-
- Implemented in Java
-
- Has a Scala interface. :)
-
- This is a pre-existing interlingual lexicon compatible with GF, which has many possible senses for each given word.
-
- Super efficient and parallelizable implementation in Haskell.
-
- Really easy approach to generating word embeddings. It would be straightforward to port over to Haskell -- we could probably use something like repa or massiv to do the calculations.
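  As a minimal sketch of what the first step of such a port might look like (using plain `Data.Map` rather than repa/massiv, and with `coocCounts` being a hypothetical name, not anything from the paper above): count-based embedding methods generally start from a word/context co-occurrence matrix, over which a factorization (e.g. the SVD from `hmatrix-svdlibc`) is then run.

  ```haskell
  import qualified Data.Map.Strict as Map

  -- Count symmetric word co-occurrences within a fixed window.
  -- This is the raw count matrix that count-based embedding
  -- pipelines (e.g. PPMI reweighting followed by SVD) start from.
  coocCounts :: Int -> [String] -> Map.Map (String, String) Int
  coocCounts window ws =
    Map.fromListWith (+)
      [ ((w, c), 1)
      | (i, w) <- indexed
      , (j, c) <- indexed
      , i /= j
      , abs (i - j) <= window
      ]
    where
      indexed = zip [0 :: Int ..] ws
  ```

  A repa/massiv version would replace the `Map` with a dense (or sparse) array indexed via a vocabulary, which is where the parallel array machinery would actually pay off.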
- Type-Driven Neural Programming by Example
- This thesis shows an approach to structuring a neural network for program synthesis-like tasks. A similar structure might be used to supply other types of tree-like (structured) data as the input or output of a neural net.
-
Semigroupoid Interfaces for Relation-Algebraic Programming in Haskell
- From the folks at Kiel (Curry). Seems like an interesting way to integrate logic programming into Haskell.
-
- This is a thesis that uses categorial methods to study morphology.
-
A Spectral Learning Algorithm for Finite State Transducers
- Here is a potential technique I might be able to use to derive FSTs from underlying data.
-
Generalizing inflection tables into paradigms with finite state operations
- This seems like a promising approach to deriving FSTs from standard inflectional tables. It uses a Prolog program that generates a pure C transducer.
-
Learning Finite-State Models for Language Understanding
- This is, to me, a surprising application of transducers -- natural language is translated into a (somewhat limited?) formal language via transducer.
-
Differentiable Weighted Finite-State Transducers
- This seems like an interesting approach to apply the techniques of transducers in a conventional (differentiable) ML context.
-
Bringing Machine Learning and Compositional Semantics Together
-
Neural Compositional Denotational Semantics for Question Answering
-
- An interesting use of proof nets for neurosymbolic parsing.
-
- This page is a great introductory resource, going over different approaches to typelogical grammar. It also talks about (at least one of) the relationships between typelogical grammar and linear logic, as well as referencing some approaches which might be good to look into for parsing.
-
Logic Analysis of Natural Language Based on Predicate Linear Logic
- This paper applies ideas from linear logic to natural language semantics -- possibly similar to some of my ideas on the topic. Regardless, this might at least be a good starting point.
-
Montague semantics, nominalizations and Scott’s domains
- An application of domain theory to natural language semantics. The SEP article cites this as being an example of "property semantics" -- which is used to get past some of the limitations of conventional (non-structural) possible world semantics in higher-order sentences like "Mary likes loving John".
-
Controlled Natural Languages for Knowledge Representation
- This seems very similar in spirit to some of my motivations for Montague.
- See also the Wikipedia page, which mentions an Esperanto-based controlled natural language, although the emphasis there does not seem to be on semantics.
-
Type-Driven Incremental Semantic Parsing with Polymorphism
- This is very similar to some of the ideas I've tried to apply in Montague. Both subtyping and polymorphism are used.
-
Dependent-Type-Theoretic Methods in Natural Language Semantics (ncat lab)
- This page is a great starting point for the literature on dependent type theory in natural language semantics. There are even references using homotopy type theory!
-
Compositional Semantics: An Introduction to the Syntax/Semantics Interface
- This is a fairly recent (2014) book on the subject which looks like it covers some interesting topics.
-
- This looks like a good jumping off point to investigate applications of game-theoretic semantics in linguistics.
- Additionally, it seems like some connections to Wittgenstein's "language games" are made here.
- It is unclear if there is anything here that directly uses game-theoretic semantics in a typelogical context. Compositional semantics for a language of imperfect information seems like another paper which might be in this direction (giving a compositional semantics to game-theoretic ideas), but again I am not sure if there has been a super concrete connection with type-logical approaches as of yet.
-
The Combinatory Morphemic Lexicon
- This paper looks at both English and Turkish morphology through the lens of categorial grammar.
- Apparently earlier approaches (such as the ones cited in Carpenter) are not very complete accounts, so this may be a better starting place.
-
Birelational Kripke semantics for an intuitionistic LTL
- This is more relevant to the work I'm doing in Montague. How do we implement a semantics for tenses? LTL seems like a natural fit.
- Carpenter actually covers this question in some depth.
-
Concepts in a Probabilistic Language of Thought
- A fascinating account of how a probabilistic programming language (Church) can be used to model "fuzzy" and scientific modes of human reasoning, including providing a good account of counterfactuals (especially given some of the issues with the typical modeling of counterfactuals mentioned in Carpenter).
- There is even an entire book on this topic here, which uses a newer JavaScript-based version of Church. It would be interesting for Montague to see if we could build something similar to Church as an embedded DSL in Haskell.
-
Logic and pragmatics: Linear logic for inferential practice
- This paper looks like a very deep application of linear logic to the realm of pragmatics, related to Brandom's work.
- Although it does not look like it uses typelogical grammar, I believe it could be applied there.
- Notably, it looks like this makes use of the full (i.e. disjunctive and conjunctive, additive and multiplicative) gamut of linear logic connectives.
-
On an intuitionistic logic for pragmatics
- Similarly to the above, this paper studies pragmatics -- however, it makes use of a bi-intuitionistic logic rather than linear logic.
- This paper is also interesting more generally, in that it apparently looks at a computational interpretation of co-intuitionism in terms of coroutines.
-
Human-level concept learning through probabilistic program induction
-
A Compositional Bayesian Semantics for Natural Language
- This is a really interesting-looking paper. Notably, it is implemented in Haskell, uses typelogical methods, and, in addition to using a Bayesian semantics, also incorporates ideas from distributional semantics.
-
Bayesian Natural Language Semantics and Pragmatics
- Another reference to potentially explore the use of Bayesian methods in compositional semantics.
-
- A probabilistic programming language implemented in Julia.
-
- Another probabilistic programming language, which also incorporates ideas from machine learning.
- This has some really interesting examples, such as Rational Speech Acts.
- Do existing Haskell-based probabilistic programming DSLs like `monad-bayes` have the same capabilities as languages like Church, especially as showcased in papers like "Concepts in a Probabilistic Language of Thought"?
-
- What sort of NLP tasks might this be good for? What advantage do they have over regular transducers?
-
- A framework integrating differentiable logic programming with neural networks.