Repository synchronization
OpenGrok itself does not provide a way to synchronize repositories, but it ships with a set of Python scripts that make it easy to both synchronize and reindex.
The opengrok-mirror script synchronizes the repositories of projects by running the appropriate commands (e.g. git pull for Git). While it runs perfectly fine standalone, it is meant to be run from within opengrok-sync (see above).
The script accepts the configuration either in JSON or YAML.
The script assumes that OpenGrok is set up with projects (i.e. the indexer is run with the -P option).
It can be used within the opengrok-sync script; see https://github.com/OpenGrok/OpenGrok/wiki/Per-project-management-and-workflow for more details.
A configuration file in YAML can look like this, for example:
    #
    # Commands (or paths - for specific repository types only)
    #
    commands:
      hg: /usr/bin/hg
      svn: /usr/bin/svn
      teamware: /ontools/onnv-tools-i386/teamware/bin
    #
    # The proxy environment variables will be set for a project's repositories
    # if the 'proxy' property is True.
    #
    proxy:
      http_proxy: proxy.example.com:80
      https_proxy: proxy.example.com:80
      ftp_proxy: proxy.example.com:80
      no_proxy: example.com,foo.example.com
    hookdir: /tmp/hooks
    # per-project hooks relative to 'hookdir' above
    logdir: /tmp/logs
    command_timeout: 300
    hook_timeout: 1200
    #
    # Per project configuration.
    #
    projects:
      http:
        proxy: true
      opengrok-stable:
        disabled: true
      userland:
        proxy: true
        hook_timeout: 3600
        hooks:
          pre: userland-pre.ksh
          post: userland-post.ksh
      opengrok-master:
        ignored_repos:
          - testdata/repositories/*
      jdk.*:
        proxy: true
        hooks:
          post: jdk_post.sh
In the above configuration, the userland project will be run with the environment variables from the proxy section, and it will also run the scripts specified in its hooks section before and after all its repositories are synchronized. The hook scripts are run with the current working directory set to that of the project.
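The way a hook picks up the proxy variables and the project directory can be sketched with the standard library. This is an illustration only, not the opengrok-mirror implementation: the function name run_hook is hypothetical, and a small Python child process stands in for a real hook script such as userland-pre.ksh.

```python
import os
import subprocess
import sys
import tempfile


def run_hook(argv, project_dir, proxy_env=None, timeout=None):
    """Run a hook command in project_dir (hypothetical helper).

    proxy_env is a dict such as the 'proxy' section of the configuration;
    it is merged into the inherited environment only when given.
    """
    env = dict(os.environ)
    if proxy_env:
        env.update(proxy_env)
    return subprocess.run(argv, cwd=project_dir, env=env, timeout=timeout,
                          capture_output=True, text=True)


proxy = {"http_proxy": "proxy.example.com:80",
         "https_proxy": "proxy.example.com:80"}

# Stand-in for a hook script: report the proxy variable and working directory.
child = "import os; print(os.environ['http_proxy']); print(os.getcwd())"

with tempfile.TemporaryDirectory() as project_dir:
    result = run_hook([sys.executable, "-c", child], project_dir,
                      proxy_env=proxy, timeout=1200)
    reported_proxy, reported_cwd = result.stdout.splitlines()
    same_dir = os.path.realpath(reported_cwd) == os.path.realpath(project_dir)
```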
The opengrok-master project contains an RCS repository that would make the mirroring fail (since opengrok-mirror does not support RCS yet), so it is marked as ignored.
Just like opengrok-sync, opengrok-mirror also queries the web application for various properties, so if the web application is not listening on the default host/port, its URI has to be specified using the -U option.
Multiple projects can share the same configuration using regular expressions, as demonstrated by the jdk.* pattern in the above configuration. The patterns are matched from the top to the bottom of the configuration file; the first match wins.
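The first-match-wins rule can be illustrated with a short sketch. The function name find_project_config is hypothetical, and matching the whole project name with re.fullmatch is an assumption; the actual script may anchor its patterns differently.

```python
import re


def find_project_config(project_name, projects):
    """Return (pattern, settings) for the first matching pattern, else None.

    'projects' is the mapping from the 'projects' section of the
    configuration; dict order follows the order in the file.
    """
    for pattern, settings in projects.items():
        # Assumption: the pattern has to match the whole project name.
        if re.fullmatch(pattern, project_name):
            return pattern, settings
    return None


projects = {
    "http": {"proxy": True},
    "opengrok-stable": {"disabled": True},
    "jdk.*": {"proxy": True, "hooks": {"post": "jdk_post.sh"}},
}

jdk_match = find_project_config("jdk8u", projects)
no_match = find_project_config("unknown", projects)
```

Because iteration stops at the first hit, a more specific entry has to be listed above a broader pattern that would also match it.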
The opengrok-stable project is marked as disabled. This means that the opengrok-mirror script will exit with the special value of 2, which the opengrok-sync script interprets as a signal to skip the reindex. It is not treated as an error.
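A caller that wraps the mirroring step could interpret the exit value roughly as follows. This is a sketch only: the constant and function names are made up, and treating 0 as success and every other value as an error is an assumption based on common convention rather than on the text above.

```python
SUCCESS_EXITVAL = 0    # assumption: conventional success value
DISABLED_EXITVAL = 2   # special value for disabled projects, per the text above


def next_step(mirror_exit_code):
    """Decide what a wrapper should do after the mirroring step."""
    if mirror_exit_code == SUCCESS_EXITVAL:
        return "reindex"
    if mirror_exit_code == DISABLED_EXITVAL:
        # Not an error: the project is disabled, so there is nothing to reindex.
        return "skip-reindex"
    return "error"
```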
Some repositories under a project are not meant to be synchronized (e.g. the remote no longer exists, or it is a repository used only by that project's tests). opengrok-mirror can ignore them if you list them in ignored_repos. This is a list of paths relative to the matched project (see project matching above); it supports filename glob expansion (see the example).
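Glob matching of repository paths against ignored_repos can be sketched with the standard fnmatch module. The helper name is_ignored is hypothetical, and whether opengrok-mirror uses fnmatch internally is an assumption.

```python
from fnmatch import fnmatchcase


def is_ignored(repo_path, ignored_repos):
    """Return True if repo_path (relative to the project) matches any glob."""
    return any(fnmatchcase(repo_path, pattern) for pattern in ignored_repos)


ignored_repos = ["testdata/repositories/*"]

rcs_ignored = is_ignored("testdata/repositories/rcs_test", ignored_repos)
src_ignored = is_ignored("opengrok-indexer", ignored_repos)
```

Note that in fnmatch the * wildcard also matches the / separator, so the pattern above covers nested paths as well.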
In batch mode, log messages are written to a log file under the logdir directory specified in the configuration. The log is rotated on each run, keeping up to the default count (8) of old files or the count specified using the --backupcount option.
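Per-run rotation of this kind can be approximated with logging.handlers.RotatingFileHandler by forcing a rollover at startup. The sketch below is an assumption about how such rotation could be done, not the script's actual code; the helper name and the log file naming are hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def get_batch_logger(logdir, project, backupcount=8):
    """Return a logger writing to <logdir>/<project>.log, rotated per run."""
    logfile = os.path.join(logdir, project + ".log")
    # maxBytes=0 disables size-based rotation; rollover is done explicitly.
    handler = RotatingFileHandler(logfile, maxBytes=0,
                                  backupCount=backupcount, delay=True)
    if os.path.exists(logfile):
        handler.doRollover()  # previous run's log becomes <project>.log.1
    logger = logging.getLogger("mirror." + project)
    logger.setLevel(logging.INFO)
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    logger.addHandler(handler)
    return logger


# Simulate four runs while keeping only two old logs.
with tempfile.TemporaryDirectory() as logdir:
    for run in range(4):
        logger = get_batch_logger(logdir, "userland", backupcount=2)
        logger.info("run %d", run)
        for h in list(logger.handlers):
            logger.removeHandler(h)
            h.close()
    files = sorted(os.listdir(logdir))
```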
If pre and post mirroring hooks are specified, they are run before and after project synchronization. If any of the hooks fails, the program is terminated immediately. However, if the synchronization (which runs in between the hook scripts) fails, the post hook is executed anyway. This is done so that the project is left in a sane state: usually the post hook is used to extract source archives and apply patches. If the pre hook is used to clean up the extracted work and the post hook were skipped after a failed synchronization, the project would be left bare.
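The ordering of hooks and synchronization can be summarized in a few lines. This is a simplified model with hypothetical names: each step is a callable returning True on success, and a hook failure returns False where the real program would terminate.

```python
def mirror_project(pre_hook, sync, post_hook):
    """Run pre hook, synchronization and post hook in the order described."""
    if pre_hook is not None and not pre_hook():
        return False           # pre hook failed: stop immediately
    sync_ok = sync()           # a failure here does not skip the post hook
    if post_hook is not None and not post_hook():
        return False           # post hook failed
    return sync_ok


calls = []


def step(name, ok):
    """Build a fake step that records its name and returns 'ok'."""
    def run():
        calls.append(name)
        return ok
    return run


# Synchronization fails, yet the post hook still runs.
result = mirror_project(step("pre", True), step("sync", False), step("post", True))
```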
Both repository synchronization commands and hooks can have a timeout. There is no timeout unless one is specified in the configuration file. Timeouts can be set globally and per project, with the latter overriding the former. For instance, in the above configuration file, the userland project overrides the global hook timeout with 1 hour while inheriting the global command timeout.
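The override rule amounts to a lookup with a fallback. In this sketch the function name effective_timeout is hypothetical, and None stands for "no timeout"; the values are taken from the configuration above.

```python
def effective_timeout(name, global_config, project_config):
    """Return the per-project timeout if set, else the global one, else None."""
    return project_config.get(name, global_config.get(name))


global_config = {"command_timeout": 300, "hook_timeout": 1200}
userland = {"proxy": True, "hook_timeout": 3600}

hook_timeout = effective_timeout("hook_timeout", global_config, userland)
command_timeout = effective_timeout("command_timeout", global_config, userland)
```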
It is possible to configure a command to be executed for disabled projects. As with opengrok-sync, this supports both RESTful API calls and command execution. This makes it possible, for instance, to tag disabled projects with Messages so that they are annotated in the UI (set the duration to less than the mirroring/syncing period to avoid duplicate messages).
The disabled command is configured globally and varies per project thanks to pattern substitution/append.
Any failures in disabled command processing are logged and do not change the overall result of the mirroring command.
Command examples:
    disabled_command:
      command:
        - http://localhost:8080/source/api/v1/messages
        - POST
        - cssClass: info
          duration: PT1M
          tags: ['%PROJECT%']
          text: disabled project
    projects:
      foo:
        disabled: true

    disabled_command:
      command: [cat]
    projects:
      foo:
        disabled: true
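The %PROJECT% substitution used in the first example can be pictured as a recursive replacement over the command structure. This sketch covers substitution only, not the append behavior; the function name is hypothetical and the real script may process the command differently.

```python
PROJECT_PATTERN = "%PROJECT%"


def substitute_project(value, project):
    """Replace %PROJECT% in every string nested inside lists and dicts."""
    if isinstance(value, str):
        return value.replace(PROJECT_PATTERN, project)
    if isinstance(value, list):
        return [substitute_project(item, project) for item in value]
    if isinstance(value, dict):
        return {key: substitute_project(item, project)
                for key, item in value.items()}
    return value  # numbers, booleans and None are left untouched


command = [
    "http://localhost:8080/source/api/v1/messages",
    "POST",
    {"cssClass": "info", "duration": "PT1M",
     "tags": ["%PROJECT%"], "text": "disabled project"},
]

resolved = substitute_project(command, "foo")
```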