diff --git a/oletools/README.html b/oletools/README.html
index 2e40975b..5a3199ec 100644
--- a/oletools/README.html
+++ b/oletools/README.html
@@ -9,12 +9,24 @@
-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
+oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the Author - Repository - Updates on Twitter
Note: python-oletools is not related to OLETools published by BeCubed Software.
See the full changelog for more information.
-oletools are used by a number of projects and online malware analysis services, including Viper, REMnux, Hybrid-analysis.com, Joe Sandbox, Deepviz, Laika BOSS, Cuckoo Sandbox, Anlyz.io, pcodedmp and probably VirusTotal. (Please contact me if you have or know a project using oletools)
+oletools are used by a number of projects and online malware analysis services, including Viper, REMnux, FAME, Hybrid-analysis.com, Joe Sandbox, Deepviz, Laika BOSS, Cuckoo Sandbox, Anlyz.io, ViperMonkey, pcodedmp, dridex.malwareconfig.com, and probably VirusTotal. (Please contact me if you have or know a project using oletools)
To use python-oletools from the command line as analysis tools, you may simply download the latest release archive and extract the files into the directory of your choice.
-You may also download the latest development version with the most recent features.
-Another possibility is to use a git client to clone the repository (https://github.com/decalage2/oletools.git) into a folder. You can then update it easily in the future.
-If you plan to use python-oletools with other Python applications or your own scripts, then the simplest solution is to use "pip install oletools" or "easy_install oletools" to download and install in one go. Otherwise you may download/extract the zip archive and run "setup.py install".
-Important: to update oletools if it is already installed, you must run "pip install -U oletools", otherwise pip will not update it.
+The recommended way to download and install/update the latest stable release of oletools is to use pip:
+sudo -H pip install -U oletools
pip install -U oletools
This should automatically create command-line scripts to run each tool from any directory: olevba, mraptor, rtfobj, etc.
To get the latest development version instead:
+sudo -H pip install -U https://github.com/decalage2/oletools/archive/master.zip
pip install -U https://github.com/decalage2/oletools/archive/master.zip
See the documentation for other installation options.
The latest version of the documentation can be found online, otherwise a copy is provided in the doc subfolder of the package.
The code is available in a GitHub repository. You may use it to submit enhancements using forks and pull requests.
This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2016 Philippe Lagadec (http://www.decalage.info)
+The python-oletools package is copyright (c) 2012-2017 Philippe Lagadec (http://www.decalage.info)
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
This is the home page of the documentation for python-oletools. The latest version can be found online, otherwise a copy is provided in the doc subfolder of the package.
-python-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
+python-oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.
Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the Author - Repository - Updates on Twitter
Note: python-oletools is not related to OLETools published by BeCubed Software.
This license applies to the python-oletools package, apart from the thirdparty folder which contains third-party files published with their own license.
-The python-oletools package is copyright (c) 2012-2016 Philippe Lagadec (http://www.decalage.info)
+The python-oletools package is copyright (c) 2012-2017 Philippe Lagadec (http://www.decalage.info)
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
License for officeparser | +License for officeparser |
Important: on Linux/MacOSX, always add double quotes around a file name when you use wildcards such as * and ?. Otherwise, the shell may replace the argument with the actual list of files matching the wildcards before starting the script.
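When a wildcard is quoted, the script receives the literal pattern and has to expand it itself. The following is a minimal sketch of that expansion using Python's standard glob module; `expand_args` is a hypothetical helper written for illustration and is not claimed to be how the oletools scripts handle their arguments.

```python
import glob


def expand_args(args):
    """Expand wildcard patterns that reached the script unexpanded (i.e. quoted)."""
    paths = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # Keep the argument unchanged when nothing matches, so the caller can report it
        paths.extend(matches if matches else [arg])
    return paths
```

Called with `['*.doc']`, this returns the matching file names in sorted order, or the pattern itself if no file matches.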
As of v0.50, mraptor has been ported to Python 3 thanks to @sebdraven. However, the differences between Python 2 and 3 are significant and for now there is a separate version of mraptor named mraptor3 to be used with Python 3.
diff --git a/oletools/doc/olebrowse.html b/oletools/doc/olebrowse.html
index 6a369ca2..348889cd 100644
--- a/oletools/doc/olebrowse.html
+++ b/oletools/doc/olebrowse.html
@@ -24,14 +24,17 @@
Main menu, showing all streams in the OLE file:
Menu with actions for a stream:
Hex view for a stream:
oledir.py file.doc
First, import oletools.oleid, and create an OleID object to scan a file:
-import oletools.oleid
+import oletools.oleid
-oid = oletools.oleid.OleID(filename)
+oid = oletools.oleid.OleID(filename)
Note: filename can be a filename, a file-like object, or a bytes string containing the file to be analyzed.
Second, call the check() method. It returns a list of Indicator objects.
Each Indicator object has the following attributes:
@@ -90,11 +108,11 @@ How to use oleid in your P
- value: value of the indicator
For example, the following code displays all the indicators:
-indicators = oid.check()
-for i in indicators:
- print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
- print 'description:', i.description
- print ''
+indicators = oid.check()
+for i in indicators:
+ print 'Indicator id=%s name="%s" type=%s value=%s' % (i.id, i.name, i.type, repr(i.value))
+ print 'description:', i.description
+ print ''
See the source code of oleid.py for more details.
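The indicator loop above uses Python 2 print statements. As a rough Python 3 sketch, the same formatting can be wrapped in a function that accepts any object exposing the documented attributes (id, name, type, value, description). `format_indicator` is a hypothetical helper, and the sample object is a stand-in with made-up values so that the block runs without oletools installed.

```python
from types import SimpleNamespace


def format_indicator(ind):
    """Return a two-line description of an object exposing the Indicator attributes."""
    header = 'Indicator id=%s name="%s" type=%s value=%r' % (
        ind.id, ind.name, ind.type, ind.value)
    return header + '\ndescription: %s' % ind.description


# Stand-in with made-up values; with oletools installed, iterate over oid.check() instead
sample = SimpleNamespace(id='vba_macros', name='VBA Macros', type=bool,
                         value=True, description='Contains VBA macros')
print(format_indicator(sample))
```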
olemap.py file.doc
TODO
diff --git a/oletools/doc/olevba.html b/oletools/doc/olevba.html
index 5ed27818..c718243d 100644
--- a/oletools/doc/olevba.html
+++ b/oletools/doc/olevba.html
@@ -7,23 +7,41 @@
IMPORTANT: olevba is currently under active development, therefore this API is likely to change.
First, import the oletools.olevba package, using at least the VBA_Parser and VBA_Scanner classes:
-from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
+from oletools.olevba import VBA_Parser, TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML, TYPE_MHTML
To parse a file on disk, create an instance of the VBA_Parser class, providing the name of the file to open as parameter. For example:
-vbaparser = VBA_Parser('my_file_with_macros.doc')
+vbaparser = VBA_Parser('my_file_with_macros.doc')
The file may also be provided as a bytes string containing its data. In that case, the actual filename must be provided for reference, and the file content with the data parameter. For example:
-myfile = 'my_file_with_macros.doc'
-filedata = open(myfile, 'rb').read()
-vbaparser = VBA_Parser(myfile, data=filedata)
+myfile = 'my_file_with_macros.doc'
+filedata = open(myfile, 'rb').read()
+vbaparser = VBA_Parser(myfile, data=filedata)
VBA_Parser will raise an exception if the file is not a supported format, such as OLE (MS Office 97-2003), OpenXML (MS Office 2007+), MHTML or Word 2003 XML.
After parsing the file, the attribute VBA_Parser.type is a string indicating the file type. It can be either TYPE_OLE, TYPE_OpenXML, TYPE_Word2003_XML or TYPE_MHTML. (constants defined in the olevba module)
The method detect_vba_macros of a VBA_Parser object returns True if VBA macros have been found in the file, False otherwise.
-if vbaparser.detect_vba_macros():
- print 'VBA Macros found'
-else:
- print 'No VBA Macros found'
+if vbaparser.detect_vba_macros():
+ print 'VBA Macros found'
+else:
+ print 'No VBA Macros found'
Note: The detection algorithm looks for streams and storage with specific names in the OLE structure, which works fine for all the supported formats listed above. However, for some formats such as PowerPoint 97-2003, this method will always return False because VBA Macros are stored in a different way which is not yet supported by olevba.
Moreover, if the file contains an embedded document (e.g. an Excel workbook inserted into a Word document), this method may return True if the embedded document contains VBA Macros, even if the main document does not.
Example:
-for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
- print '-'*79
- print 'Filename :', filename
- print 'OLE stream :', stream_path
- print 'VBA filename:', vba_filename
- print '- '*39
- print vba_code
+for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
+ print '-'*79
+ print 'Filename :', filename
+ print 'OLE stream :', stream_path
+ print 'VBA filename:', vba_filename
+ print '- '*39
+ print vba_code
Alternatively, the VBA_Parser method extract_all_macros returns the same results as a list of tuples.
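Since each result is a plain 4-tuple (filename, stream_path, vba_filename, vba_code), it can be post-processed with ordinary Python. The sketch below counts source lines per VBA module; `summarize_macros` is a hypothetical helper, and it assumes the tuple layout shown in the example above.

```python
def summarize_macros(macros):
    """Map each VBA filename to the number of source lines in its code."""
    return {
        vba_filename: len(vba_code.splitlines())
        for _filename, _stream_path, vba_filename, vba_code in macros
    }
```

With oletools installed, the argument would be `vbaparser.extract_macros()`; any iterable of tuples with the same layout works as well.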
Since version 0.40, the VBA_Parser class provides simpler methods than VBA_Scanner to analyze all macros contained in a file:
@@ -265,24 +283,24 @@
Example:
-results = vbaparser.analyze_macros()
-for kw_type, keyword, description in results:
- print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
+results = vbaparser.analyze_macros()
+for kw_type, keyword, description in results:
+ print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
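Because analyze_macros yields (type, keyword, description) tuples, grouping them by type is one way to build a per-category report. This is a minimal sketch; `group_by_type` is a hypothetical helper, and it assumes the 3-tuple layout used in the loop above.

```python
from collections import defaultdict


def group_by_type(results):
    """Group (kw_type, keyword, description) tuples by their type."""
    grouped = defaultdict(list)
    for kw_type, keyword, description in results:
        grouped[kw_type].append((keyword, description))
    # Return a plain dict so that missing keys raise KeyError instead of silently appearing
    return dict(grouped)
```

With oletools installed, `group_by_type(vbaparser.analyze_macros())` would return one list of (keyword, description) pairs per category found.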
After calling analyze_macros, the following VBA_Parser attributes also provide the number of items found for each category:
-print 'AutoExec keywords: %d' % vbaparser.nb_autoexec
-print 'Suspicious keywords: %d' % vbaparser.nb_suspicious
-print 'IOCs: %d' % vbaparser.nb_iocs
-print 'Hex obfuscated strings: %d' % vbaparser.nb_hexstrings
-print 'Base64 obfuscated strings: %d' % vbaparser.nb_base64strings
-print 'Dridex obfuscated strings: %d' % vbaparser.nb_dridexstrings
-print 'VBA obfuscated strings: %d' % vbaparser.nb_vbastrings
+print 'AutoExec keywords: %d' % vbaparser.nb_autoexec
+print 'Suspicious keywords: %d' % vbaparser.nb_suspicious
+print 'IOCs: %d' % vbaparser.nb_iocs
+print 'Hex obfuscated strings: %d' % vbaparser.nb_hexstrings
+print 'Base64 obfuscated strings: %d' % vbaparser.nb_base64strings
+print 'Dridex obfuscated strings: %d' % vbaparser.nb_dridexstrings
+print 'VBA obfuscated strings: %d' % vbaparser.nb_vbastrings
The method reveal attempts to deobfuscate the macro source code by replacing all the obfuscated strings by their decoded content. Returns a single string.
Example:
-print vbaparser.reveal()
+print vbaparser.reveal()
After usage, it is better to call the close method of the VBA_Parser object, to make sure the file is closed, especially if your application is parsing many files.
-vbaparser.close()
+vbaparser.close()
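One way to guarantee that close is called even when an exception is raised is the standard contextlib.closing wrapper, which works with any object that has a close method. The sketch below uses a stand-in class (`FakeParser`, hypothetical) so that it runs without oletools; with the real library, the VBA_Parser instance would be passed instead.

```python
from contextlib import closing


class FakeParser:
    """Stand-in for VBA_Parser so the sketch runs without oletools."""

    def __init__(self):
        self.closed = False

    def detect_vba_macros(self):
        return True

    def close(self):
        self.closed = True


def has_macros(parser):
    # closing() calls parser.close() on exit, even if detection raises
    with closing(parser):
        return parser.detect_vba_macros()


parser = FakeParser()
print(has_macros(parser), parser.closed)
```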
The following methods and functions are still functional, but their usage is not recommended since they have been replaced by better solutions.
@@ -297,54 +315,54 @@
Example:
-vba_scanner = VBA_Scanner(vba_code)
-results = vba_scanner.scan(include_decoded_strings=True)
-for kw_type, keyword, description in results:
- print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
+vba_scanner = VBA_Scanner(vba_code)
+results = vba_scanner.scan(include_decoded_strings=True)
+for kw_type, keyword, description in results:
+ print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
The function scan_vba is a shortcut for VBA_Scanner(vba_code).scan():
-results = scan_vba(vba_code, include_decoded_strings=True)
-for kw_type, keyword, description in results:
- print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
+results = scan_vba(vba_code, include_decoded_strings=True)
+for kw_type, keyword, description in results:
+ print 'type=%s - keyword=%s - description=%s' % (kw_type, keyword, description)
scan_summary returns a tuple with the number of items found for each category: (autoexec, suspicious, IOCs, hex, base64, dridex).
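A positional 6-tuple is easy to misread, so a named tuple can make the counts self-describing. This is a sketch only: `ScanSummary` and `label_summary` are hypothetical names, and the field order is assumed to match the order listed above (autoexec, suspicious, IOCs, hex, base64, dridex).

```python
from collections import namedtuple

# Field order assumed to follow the documented tuple order
ScanSummary = namedtuple('ScanSummary', 'autoexec suspicious iocs hex base64 dridex')


def label_summary(summary):
    """Wrap a 6-item summary tuple so each count can be read by name."""
    return ScanSummary(*summary)
```

For example, `label_summary(counts).suspicious` reads the second item of the tuple without relying on its position.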
Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
The function detect_autoexec checks if VBA macro code contains specific macro names that will be triggered when the document/workbook is opened, closed, changed, etc.
It returns a list of tuples containing two strings, the detected keyword, and the description of the trigger. (See the malware example above)
Sample usage:
-from oletools.olevba import detect_autoexec
-autoexec_keywords = detect_autoexec(vba_code)
-if autoexec_keywords:
- print 'Auto-executable macro keywords found:'
- for keyword, description in autoexec_keywords:
- print '%s: %s' % (keyword, description)
-else:
- print 'Auto-executable macro keywords: None found'
+from oletools.olevba import detect_autoexec
+autoexec_keywords = detect_autoexec(vba_code)
+if autoexec_keywords:
+ print 'Auto-executable macro keywords found:'
+ for keyword, description in autoexec_keywords:
+ print '%s: %s' % (keyword, description)
+else:
+ print 'Auto-executable macro keywords: None found'
Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
The function detect_suspicious checks if VBA macro code contains specific keywords often used by malware to act on the system (create files, run commands or applications, write to the registry, etc).
It returns a list of tuples containing two strings, the detected keyword, and the description of the corresponding malicious behaviour. (See the malware example above)
Sample usage:
-from oletools.olevba import detect_suspicious
-suspicious_keywords = detect_suspicious(vba_code)
-if suspicious_keywords:
- print 'Suspicious VBA keywords found:'
- for keyword, description in suspicious_keywords:
- print '%s: %s' % (keyword, description)
-else:
- print 'Suspicious VBA keywords: None found'
+from oletools.olevba import detect_suspicious
+suspicious_keywords = detect_suspicious(vba_code)
+if suspicious_keywords:
+ print 'Suspicious VBA keywords found:'
+ for keyword, description in suspicious_keywords:
+ print '%s: %s' % (keyword, description)
+else:
+ print 'Suspicious VBA keywords: None found'
Deprecated: It is preferable to use either scan_vba or VBA_Scanner to get all results at once.
The function detect_patterns checks if VBA macro code contains specific patterns of interest, that may be useful for malware analysis and detection (potential Indicators of Compromise): IP addresses, e-mail addresses, URLs, executable file names.
It returns a list of tuples containing two strings, the pattern type, and the extracted value. (See the malware example above)
Sample usage:
-from oletools.olevba import detect_patterns
-patterns = detect_patterns(vba_code)
-if patterns:
- print 'Patterns found:'
- for pattern_type, value in patterns:
- print '%s: %s' % (pattern_type, value)
-else:
- print 'Patterns: None found'
+from oletools.olevba import detect_patterns
+patterns = detect_patterns(vba_code)
+if patterns:
+ print 'Patterns found:'
+ for pattern_type, value in patterns:
+ print '%s: %s' % (pattern_type, value)
+else:
+ print 'Patterns: None found'
When an OLE Package object contains an executable file or script, it is highlighted as such. For example:
To extract an object or file, use the option -s followed by the object number as shown in the table.
Example:
@@ -67,9 +86,9 @@
rtf_iter_objects(filename) is an iterator which yields a tuple (index, orig_len, object) providing the index of each hexadecimal stream in the RTF file, and the corresponding decoded object.
Example:
-from oletools import rtfobj
-for index, orig_len, data in rtfobj.rtf_iter_objects("myfile.rtf"):
- print('found object size %d at index %08X' % (len(data), index))
+from oletools import rtfobj
+for index, orig_len, data in rtfobj.rtf_iter_objects("myfile.rtf"):
+ print('found object size %d at index %08X' % (len(data), index))
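Building on the loop above, the decoded objects can be written to disk for further analysis. This is a minimal sketch, assuming each item is an (index, orig_len, data) tuple whose data is a bytes string; `save_objects` and the `object_%08X.bin` naming scheme are hypothetical choices, not part of rtfobj.

```python
import os


def save_objects(objects, out_dir):
    """Write each (index, orig_len, data) item to out_dir, named after its index."""
    saved = []
    for index, _orig_len, data in objects:
        path = os.path.join(out_dir, 'object_%08X.bin' % index)
        with open(path, 'wb') as f:
            f.write(data)
        saved.append(path)
    return saved
```

With oletools installed, the first argument would be `rtfobj.rtf_iter_objects("myfile.rtf")`.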