things i should not forget, and that, eventually, could interest people

f-strings, new RVT tool

f-strings, or Forensic Strings, is a new RVT tool that will soon be incorporated into RVT’s search engine.

You know what binutils’ strings command does: it extracts sequences of printable characters from a binary file. Although it supports several character encodings (plain ASCII, UTF-8, UTF-16, in little and big endian), it handles only one of them per execution, and it does not cope well with a mixture of encodings in the same file.

f-strings extracts every sequence that looks like printable characters from a binary file, whether it is written in plain ASCII, UTF-8 or UTF-16 (little endian only). This means it will usually extract more noise than binutils’ strings, but a single execution is enough.
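To make the idea concrete, here is a minimal Python sketch of a single-pass extractor. It is not the f-strings source code: the function names `decode_run` and `extract_strings` are made up for this post, and the strategy (drop NUL bytes, collect runs of printable or high bytes, then try UTF-8 and fall back to Latin-1) is only an assumption about how such a tool could work.

```python
import re

def decode_run(run):
    """Decode a run of bytes as UTF-8, falling back to Latin-1 (assumed strategy)."""
    try:
        return run.decode("utf-8")
    except UnicodeDecodeError:
        return run.decode("latin-1")

def extract_strings(data, min_len=4):
    """Return the text runs of at least min_len characters found in data."""
    # Assumption: dropping NUL bytes collapses UTF-16LE text in the
    # ASCII/Latin-1 range into one byte per character.
    squeezed = data.replace(b"\x00", b"")
    # Printable ASCII plus every high byte; the high bytes may be Latin-1
    # characters or parts of UTF-8 sequences, so some noise is expected.
    runs = re.findall(rb"[\x20-\x7e\x80-\xff]+", squeezed)
    decoded = (decode_run(run) for run in runs)
    return [text for text in decoded if len(text) >= min_len]

sample = (
    b"\x01\x02plain ascii\x03"
    + "josé".encode("utf-8")
    + b"\x04"
    + "barça".encode("utf-16-le")
    + b"\x05ab\x06"
)
print(extract_strings(sample))
```

The three encodings come out in one pass, and the two-character run is dropped because it is shorter than the default minimum of 4. Accepting every high byte is what lets the noise in.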

Moreover, f-strings translates special characters to their plain ASCII equivalents. For example, it translates ‘á’, ‘Á’, ‘à’, ‘À’, ‘ä’, etc., to ‘a’. It also translates the Spanish and Catalan special characters ‘ñ’ and ‘ç’ to ‘n’ and ‘c’.

Finally, it lowercases all the output.

For example:

$ cat accentuated.txt
el sinvergüenza de José es un ñoño y es del barça

$ ./f-strings accentuated.txt
el sinverguenza de jose es un nono y es del barca
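
The same folding can be approximated in a few lines of Python. This is only a sketch, and it does not work the way f-strings necessarily does: it relies on Unicode decomposition from the standard `unicodedata` module, so it folds many more characters than the vowels, ‘ñ’ and ‘ç’ mentioned above. The function name `fold_to_ascii` is made up for this example.

```python
import unicodedata

def fold_to_ascii(text):
    """Strip accents and lowercase the text (an approximation, not f-strings itself)."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

print(fold_to_ascii("el sinvergüenza de José es un ñoño y es del barça"))
# prints: el sinverguenza de jose es un nono y es del barca
```

Folding and lowercasing at extraction time means a search for ‘jose’ also finds ‘José’ and ‘JOSÉ’, which is handy when the extracted strings feed a search engine.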

f-strings is open source (GNU GPL v2.0) and can be downloaded from the Revealer Toolkit web page (here).

Here is the f-strings help, as printed by ‘f-strings -h’:

Revealer Tools, forensic strings, 09-2009
USAGE: f-strings [-t] [-n <number> ] [-f]

-t Print the location of the string in base 10
-n Locate & print any sequence of printable characters
of at least <number> characters (default 4)
-h Display this information

f-strings get a file and prints at stdout all printable characters
like binutils' strings function, BUT:
- convert all that 'seems' latin1, UTF-8 and UTF-16, little endian
to plain ASCII
- translates some special characters. For example, accented a's are translated
to the ASCII character 'a'. All vowels, plus spanish and catalan special
characters are translated
- lowercases all printable characters

known issues:
- only one file at each execution
- offset is printed only in base 10, so argument -t do not
accept subarguments, and '-t' is equivalent to '-t d' of
binutils' strings
- \x00 characters are ignored, so in a hard disk full of zeros
with 'Hey ' at the begining and 'Ho' at the end, f-strings will
extract the string 'Hey Ho'

more information at http://code.google.com/p/revealertoolkit/
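
The ‘Hey Ho’ known issue and the base 10 offsets are easy to reproduce with a small scanner. Again, this is a hedged sketch and not the real implementation: `scan_with_offsets` is a hypothetical name, it only looks at printable ASCII, and reporting the offset of the first byte of each run is my assumption about what ‘-t’ prints.

```python
def scan_with_offsets(data, min_len=4):
    """Return (offset, text) pairs for printable ASCII runs, skipping NUL bytes."""
    results = []
    start = None   # offset of the first byte of the current run
    chars = []     # characters collected for the current run
    for offset, byte in enumerate(data):
        if byte == 0:
            continue  # a NUL neither extends nor breaks a run
        if 0x20 <= byte <= 0x7E:
            if start is None:
                start = offset
            chars.append(chr(byte))
            continue
        # Any other byte ends the current run.
        if len(chars) >= min_len:
            results.append((start, "".join(chars)))
        start, chars = None, []
    if len(chars) >= min_len:
        results.append((start, "".join(chars)))
    return results

# A "disk" full of zeros with 'Hey ' at the beginning and 'Ho' at the end.
disk = b"Hey " + b"\x00" * 1000 + b"Ho"
for offset, text in scan_with_offsets(disk):
    print(offset, text)
# prints: 0 Hey Ho
```

Skipping NUL bytes is what makes UTF-16 little endian text readable without a separate pass, and it is also why two fragments separated only by zeros end up glued together.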

Enjoy, and send feedback!


Written by dervitx

6 September 2009 at 13:25
