docs: explain Go symbol hash #7

Merged
merged 2 commits into from
Oct 31, 2022
8 changes: 7 additions & 1 deletion README.md
@@ -5,4 +5,10 @@
 - `Stripped`: scan files that may be executable and report whether they are a Go executable that has had its symbols stripped.
 - `ImportHash`: calculate the [imphash](https://www.fireeye.com/blog/threat-research/2014/01/tracking-malware-import-hashing.html) of an executable with dynamic imports.
 - `GoSymbolHash`: calculate an imphash analogue for Go executables compiled by the gc-compiler.
-- `Sections`: provide section size statistics for an executable.
+
+The `GoSymbolHash` algorithm is analogous to the algorithm described for `ImportHash` with the exception that Go's static symbols are used in place of the dynamic import symbols used by `ImportHash`.
+
+The list of symbols referenced by the executable is obtained and the MD5 hash of the ordered list of symbols, separated by commas, is calculated.
Member

Is this in the order that they are listed in the binary?

Collaborator Author

@efd6 efd6 Oct 31, 2022

This is not specified in the original description of imphash (irritatingly), but is shown in the reference implementation. Yes, it is in appearance order in the binary without additional lexical ordering. Given that I find it irritating that it's not mentioned in imphash, I'll add that detail here.

+The order of the symbols is as exists in the executable and returned by the Go standard library debug packages.
+The fully qualified import path of each symbol is included and while symbols used by `ImportHash` are canonicalised to lowercase, `GoSymbolHash` retains the case of the original symbol. `GoSymbolHash` may be calculated including or excluding standard library imports.
+- `Sections`: provide section size and entropy statistics for an executable.
17 changes: 15 additions & 2 deletions toutoumomoma.go
@@ -155,6 +155,10 @@ func (f *File) Stripped() (sneaky bool, err error) {
 // Darwin imports are the list of symbols without a library prefix and is equivalent
 // to the Anomali SymHash https://www.anomali.com/blog/symhash.
 //
+// The algorithm obtains the list of imported function names and converts them to all
+// lowercase. Any file extension is removed and then the MD5 hash of the ordered list of
+// symbols, separated by commas, is calculated.
+//
 // Darwin:
 // ___error
 // __exit
@@ -202,8 +206,17 @@ func (f *File) ImportHash() (hash []byte, imports []string, err error) {
 // from the Go standard library are included, otherwise only third-party symbols
 // are considered.
 //
-// If the file at is an executable, but not a gc-compiled Go executable,
-// ErrNotGoExecutable will be returned.
+// The algorithm is analogous to the algorithm described for ImportHash with the exception
+// that Go's static symbols are used in place of the dynamic import symbols used by the
+// ImportHash. The list of symbols referenced by the executable is obtained and the MD5 hash
+// of the ordered list of symbols, separated by commas, is calculated. The order of the
+// symbols is as exists in the executable and returned by the standard library debug packages.
+// The fully qualified import path of each symbol is included and while symbols used by
+// ImportHash are canonicalised to lowercase, GoSymbolHash retains the case of the original
+// symbol.
+//
+// If the file is an executable, but not a gc-compiled Go executable, ErrNotGoExecutable
+// will be returned.
 func (f *File) GoSymbolHash(stdlib bool) (hash []byte, imports []string, err error) {
 	ok, err := f.isGoExecutable()
 	if !ok || err != nil {