This module provides a ready-to-use KAF wrapper for an English constituent parser (including head words) built on the Stanford CoreNLP API (http://www-nlp.stanford.edu/software/).
It can also format its output in the styles offered by the Stanford CoreNLP API (penn, oneline).
All dependencies and classpath configurations are automatically managed by Maven.
The contents of the module are the following:
+ pom.xml: maven pom file which deals with everything related to compilation and execution of the module
+ src/: java source code of the module
Furthermore, the installation process, as described in the README.md, will generate another directory:
+ target/: contains the binary executable and other directories
Installing stanford-parse requires the following steps:
If you already have JDK 7 and MAVEN 3 installed on your machine, go directly to the third step (Get module source code). Otherwise, follow these steps:
- Install JDK 1.7
If you do not install JDK 1.7 in a default location, you will probably need to configure the PATH in .bashrc or .bash_profile:
export JAVA_HOME=/yourpath/local/java7
export PATH=${JAVA_HOME}/bin:${PATH}
If you use tcsh you will need to specify it in your .login as follows:
setenv JAVA_HOME /usr/java/java17
setenv PATH ${JAVA_HOME}/bin:${PATH}
If you log in to your shell again and run the command
java -version
you should now see that your JDK version is 1.7.
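The version check can also be scripted. The sketch below is only an illustration: `java_major_minor` is a hypothetical helper, not part of the module, and it assumes the `version "1.7.0_xx"` line format that JDK 7 prints as the first line of `java -version`.

```shell
# Hypothetical helper, not part of stanford-parse: extract "major.minor"
# from a line such as: java version "1.7.0_80"
java_major_minor() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only query the JVM if one is actually on the PATH.
# `java -version` writes to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
    first_line=$(java -version 2>&1 | head -n 1)
    if [ "$(java_major_minor "$first_line")" = "1.7" ]; then
        echo "JDK 1.7 found"
    else
        echo "Unexpected Java version: $first_line"
    fi
else
    echo "java not found on PATH"
fi
```

If the script reports an unexpected version, the most likely cause is that another JDK comes earlier in the PATH than ${JAVA_HOME}/bin.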
- Install MAVEN 3
Download MAVEN 3:
wget http://apache.rediris.es/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
Unpack the archive, then configure the PATH. For Bash Shell:
export MAVEN_HOME=/home/ragerri/local/apache-maven-3.0.5
export PATH=${MAVEN_HOME}/bin:${PATH}
For tcsh shell:
setenv MAVEN3_HOME ~/local/apache-maven-3.0.5
setenv PATH ${MAVEN3_HOME}/bin:${PATH}
If you log in to your shell again and run the command
mvn -version
you should see a reference to the MAVEN version you have just installed plus the JDK 7 that it is using.
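Before moving on, it can help to confirm that both tools resolve from the PATH you configured. A minimal sketch, in which `have_tool` is a hypothetical helper name chosen for illustration:

```shell
# Hypothetical helper: succeed if the named command resolves on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report where each required tool was found, or that it is missing.
for tool in java mvn; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found, check the HOME variable and PATH set above"
    fi
done
```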
- Get module source code
hg clone ssh://hg@bitbucket.org/ragerri/stanford-parse-en stanford-parse
- Move into main directory
cd stanford-parse
- Install module using maven
Obtain the parser model englishPCFG.ser.gz, which is included in the Stanford parser, from the Stanford NLP Group site:
http://www-nlp.stanford.edu/software/lex-parser.shtml
After you unzip the download, look for stanford-parser-3.2.0-models.jar in the parser directory and extract the models:
jar xf stanford-parser-3.2.0-models.jar
Then copy the model to the stanford-parse/src/main/resources/ directory:
cp edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz stanford-parse/src/main/resources/
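Since the build packages whatever is in src/main/resources/, it may be worth checking that the model actually landed there before compiling. The following is a sketch under assumptions: `model_in_place` is a hypothetical helper, and the check assumes you run it from the directory that contains stanford-parse/.

```shell
# Hypothetical helper: check that the English PCFG model exists and is
# non-empty where the module expects it. $1 is the module's root directory.
model_in_place() {
    [ -s "$1/src/main/resources/englishPCFG.ser.gz" ]
}

# Assumes the current directory is the one containing stanford-parse/.
if model_in_place stanford-parse; then
    echo "model found"
else
    echo "englishPCFG.ser.gz is missing or empty in stanford-parse/src/main/resources/"
fi
```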
Compile package:
mvn clean package
This step will create a directory called target/ which contains various directories and files. Most importantly, there you will find the module executable:
stanford-parse-3.2.0.jar
The version reflects the API version of Stanford-CoreNLP used to create this module.
This executable contains every dependency the module needs, so it is completely portable as long as you have a JVM 1.7 installed.
To install the module in Maven's local user repository, located in ~/.m2/repository, run:
mvn clean install
USING stanford-parse
========================
The program accepts KAF as input and outputs KAF elements containing the parse trees. The KAF structure is described at:
https://github.com/opener-project/kaf/wiki/KAF-structure-overview
To run the program, execute the following, replacing /yourpath with the location of your stanford-parse directory:
cat file.kaf | java -jar /yourpath/stanford-parse/target/stanford-parse-3.2.0.jar --kaf
It also provides Collins and semantic head words and other output formats such as Penn TreeBank. Check the tool help for a description:
java -jar /yourpath/stanford-parse/target/stanford-parse-3.2.0.jar -help
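Because the parser reads KAF on standard input, a directory of files can be processed with a small loop. The sketch below is one possible way to batch the call, not something the module ships. It assumes the parser writes its result to standard output; `parse_dir` and the `.parsed.kaf` suffix are hypothetical names, and the parser command is passed in as arguments so the jar location stays under your control.

```shell
# Hypothetical helper: run a stdin-to-stdout command over every *.kaf
# file in $1 and write the results to $2. The remaining arguments are
# the command to run.
parse_dir() {
    in_dir=$1
    out_dir=$2
    shift 2
    mkdir -p "$out_dir" || return 1
    for f in "$in_dir"/*.kaf; do
        [ -e "$f" ] || continue   # no matches: the glob stays literal
        "$@" < "$f" > "$out_dir/$(basename "$f" .kaf).parsed.kaf" || return 1
    done
}

# Example call (not executed here; adjust /yourpath to your checkout):
# parse_dir input/ output/ java -jar /yourpath/stanford-parse/target/stanford-parse-3.2.0.jar --kaf
```

Passing the command as arguments also makes the loop easy to try out with a stand-in such as `cat` before running the real parser.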
You can also generate the javadoc of the module by executing:
mvn javadoc:jar
This will create the jar file core/target/stanford-parse-3.2.0-javadoc.jar
Rodrigo Agerri
IXA NLP Group
University of the Basque Country (UPV/EHU)
E-20018 Donostia-San Sebastián
rodrigo.agerri@ehu.es