I read an article on Fireeye's website the other day where they used Machine Learning to eliminate a lot of the noise that comes out of tools like strings. It's pretty interesting and looks like it would save me some time when looking through malware. https://www.fireeye.com/blog/threat-research/2019/05/learning-to-rank-strings-output-for-speedier-malware-analysis.html I wondered how effective freq.py scores would be in helping to eliminate the noise. 45 minutes and 29 lines of Python code later I have something that looks like it works. Check out freq_sort.py. Before freq_sort.py here is the output of strings on a piece of malware: student@573:~/freq$ strings -n 6 malware.exe | head -n 20 !This program cannot be run in DOS mode. e!Rich `.rdata @.data .pdata @.gfids @.rsrc @.reloc \$0u"H L$ SVWH K SVWH |$ H;_ <bt%<xt!<Zt |$ AVH l$ VWAV L$ SUVWH UVWATAUAVAWH 0A_A^A]A\_^] UVWATAUAVAWH @A_A^A]A\_^] After freq_sort.py the useful stings quickly bubble to ...
SRUM_DUMP and SRUM_DUMP_CSV have been ported to Python3 and are available for download from the PYTHON3 branch of my github page. https://github.com/MarkBaggett/srum-dump/tree/python3 In moving to Python3 I also updated the modules that I depend upon to parse and create XLSX files and access the ESE database that contains the SRUM data. I hope that this will fix the issue that some users have experienced with SRUDB.dat files that create very large spreadsheets. If it does not please let me know and continue to use SRUM_DUMP_CSV.EXE to avoid the XLSX problem. In moving to Python3 you will find the process to be faster. If you would like to run the tools from source instructions for doing so are in the README on the github page.