Fensterbrief - a python script for working with letters based on LaTeX and Markdown

January 22, 2017 at 02:10 AM | categories: latex, tools

I use LaTeX for more than a decade. Naturally it is my preferred tool suite when it comes to writing letters. Since working with LaTeX and doing office work requires a bit book keeping, I started developing a helper tool to work with LaTeX letters and corresponding templates.

I think everything started when I had to write a letter a long time ago and instead of just writing the letter directly, I implemented a CGI script that asked for metadata and then rendered everything into a PDF file using the LaTeX code as intermediate format. There are several similar scripted approaches, you can find on the Internet. Meanwhile, for me it is more natural to have LaTeX letter files on my filesystem and when I write a new letter, I may reuse an old one, especially to reuse the recipient address I manually inserted into a letter to the same recipient before. So, I keep the LaTeX letter files for a later reuse and the PDF output to have a printable copy and I have several unnecessary artifact files, which usually remain in the filesystem, too. My letters are organized in a directory structure: each directory corresponds to a business process, which stores one or more documents related to it. The files and folder have a fixed naming scheme, which includes a date in ISO date notation (YYYY-MM-DD) for letters and a YYYY-MM format for folder names to support a file name based lexicographical order.

For a long time, I manually organized these files and directories. Recently, I started automating the process and wrote a python script that implements it. The idea is to have a tool that enables me writing letters similar fast as writing e-mails, in other words in a fire and forget style. Hence, I recently added the support for Markdown and support for the lookup of postal addresses via the Google API.

The tool is called Fensterbrief and it supports these features:

  • intended to be used via command line
  • maintaining a folder and document structure
  • support for LaTeX and Markdown based letters
  • support for fax transmissions via simple-fax.de
  • support for buying postage for the Deutsche Post
  • lookup postal addresses via the Google API

The source code, further documentation, and usage examples are on Github.

Brute-forcing CRC parameters

February 10, 2012 at 11:20 AM | categories: reverse engineering, tools

Recently, I had to discover the parameters for a cycling redundancy check (CRC) calculation. Because it was the second time within six months that there was a need for determining a CRC polynomial, I implemented a brute-forcing tool. This tool reveals the (truncated) polynomial, the initial value, input and output reflection and a finally applied XOR-mask. Obviously, brute-forcing fails for large polynomials like for a CRC-32, because you have to try different polynomials and initial values. But for smaller CRCs this approach works well. Usually, you do not have truncated polynomials larger than 16, especially not in simple telegrams with some bytes.

What is a truncated polynomial? In general, a polynomial has a grade. E.g., CRC-16-CCITT has a polynomial of grade 16. This is x^16 + x^12+x^5+1. To describe the polynomial by its coefficients the leading digit is often skipped, e.g., 1.0001.0000.0010.0001 becomes 1.0000.0010.0001. Thus, the term truncated polynomial is used to indicate this truncation.

Let’s see how this tool works. In a first step, we generate some test data. Therefore, the tool generate-test-data might be used. It generates some random bit strings and calculates a CRC, which is appended to the string.

$ ./generate-test-data --final-xor 0 --messages 10 | tee testmessage_1
# width                    : 10 bits
# CRC's offset             : 49
# calc CRC for bit offsets : 0 .. 49 (not included)
# final XOR                : 0
# reflect in               : true
# reflect out              : false
# truncated polynom        : 189 (MSB not shown)
# initial value            : 149

0110110010000011101000011000110101111000000000100 1111111010
0001000000001100101100100110011011111100000110100 0101000101
1101011100111000110110110010111011110110101001001 0011100111
0100000011100011110000100111010001101100010010011 0110100011
0110110000010111010110011001010011100101110000101 1100000001
0110101000000111010111101101001000110010100111110 1011001001
1000101011110110010100001110001111101110000011000 1000010010
0110111010111000011100100111110101110101111011010 1001111001
1001101001111000010110001000011101110000111001001 0101110000
1101110110101001101001110010011101010001110000110 0000110111

Here, the bit string starts at offset 0 and ends at offset 49. Please note, that these are not the common eight bits per byte. I avoided this byte-centric view, because in low-level reverse engineering messages might be odd-formed, especially when analyzing radio transmission.

In a second step, these test messages are reformatted to fit the brute-forcing tool’s input format. Therefore, I wrote a script in Perl. It reads bit strings (“101001…”), hex strings (“0xdeadbeef…”) and comma-separated lists of hex values (“0x23, 0x42, 0x5”) and reformats them to either a hex or a bit string.

$ perl ../rewrite-as.pl bits testmessage_1 | tee testmessage_1.bits

Finally, we launch the CRC brute-forcing tool, which requires specifying several offsets: The tool needs to know, where the message’s CRC relevant portion starts and stops and where the checksum is placed. When reverse engineering a telegram format, one might determine the checksum part without problems. This is caused by the CRC algorithm’s property that it is a Galois type shift register, which produces pseudo-random numbers. But it loops not on its own. Instead the algorithm is fed with message bits. Thus, the checksum looks more random than the rest of the message.

$ ./bruteforce-crc --width 10 --start 0 --end 49 --offs-crc 49 \
       --file testmessage_1.bits --probe-reflections true 
number of threads        : 4
width                    : 10 bits
CRC's offset             : 49
calc CRC for bit offsets : 0 .. 49 (not included)
final XOR                : 0
reflect in               : false
reflect out              : false

truncated polynom        : from 0 to 1023 (MSB not shown)
initial value            : from 0 to 1023
probe reflections        : true
probe final xor          : false

extracted crc 03fa
extracted crc 0145
extracted crc 00e7
extracted crc 01a3
extracted crc 0301
extracted crc 02c9
extracted crc 0212
extracted crc 0279
extracted crc 0170
extracted crc 0037
----------------------[ MATCH ]--------------------------------
Found a model for the CRC calculation:
Truncated polynom : 0xbd (189)
Initial value     : 0x95 (149)
Final XOR         : 0x0 (0)
Reflected input   : false
Reflected output  : false
Message offset    : from bit 0 .. 49 (end not included)

The brute-forcing tool extracts the checksum field at the specified offset and compares the calculated CRCs with the extracted CRCs. If a CRC matches the extracted one, the tool continues with testing the other messages. And finally, if a CRC parameter set satisfies all messages a CRC model is found.

If you have to reverse engineer an undocumented bus-protocol or some radio messages and don’t want to reconstruct the CRC parameters by putting on one's thinking cap, you might need such a tool, too. Therefore, I published the code on github.com.

Further readings:

Radare and the GCHQ's brainteaser

December 02, 2011 at 10:15 PM | categories: reverse engineering, tools

Recently I discovered the tool suite radare, which is a handy tool for analyzing snippets of binary code. You can just open the file containing shell code and disassemble it. Radare works with PE files and ELF binaries. Further, the software supports several machine types like ARM and Intel and allows analyzing code for Dalvik and Java virtual machines.

When I stumbled upon the GCHQs reverse engineering challenge, I noticed the 0x42424242, the 0x41414141 and the not that obvious “deadbeef”. I supposed that it is not a byte wise XOR with a “secret” key, at least not on the full message. The most specific hint for me to inspect the code with radare is the message’s first byte, which is a 0xeb. This is also the first byte on my hard disk and there it represents a jump to the boot loader.

The secret message is given as:

char bytes[] = {
0xeb, 0x04, 0xaf, 0xc2, 0xbf, 0xa3, 0x81, 0xec,
0x00, 0x01, 0x00, 0x00, 0x31, 0xc9, 0x88, 0x0c,
0x0c, 0xfe, 0xc1, 0x75, 0xf9, 0x31, 0xc0, 0xba,
0xef, 0xbe, 0xad, 0xde, 0x02, 0x04, 0x0c, 0x00,
0xd0, 0xc1, 0xca, 0x08, 0x8a, 0x1c, 0x0c, 0x8a,
0x3c, 0x04, 0x88, 0x1c, 0x04, 0x88, 0x3c, 0x0c,
0xfe, 0xc1, 0x75, 0xe8, 0xe9, 0x5c, 0x00, 0x00,
0x00, 0x89, 0xe3, 0x81, 0xc3, 0x04, 0x00, 0x00,
0x00, 0x5c, 0x58, 0x3d, 0x41, 0x41, 0x41, 0x41,
0x75, 0x43, 0x58, 0x3d, 0x42, 0x42, 0x42, 0x42,
0x75, 0x3b, 0x5a, 0x89, 0xd1, 0x89, 0xe6, 0x89,
0xdf, 0x29, 0xcf, 0xf3, 0xa4, 0x89, 0xde, 0x89,
0xd1, 0x89, 0xdf, 0x29, 0xcf, 0x31, 0xc0, 0x31,
0xdb, 0x31, 0xd2, 0xfe, 0xc0, 0x02, 0x1c, 0x06,
0x8a, 0x14, 0x06, 0x8a, 0x34, 0x1e, 0x88, 0x34,
0x06, 0x88, 0x14, 0x1e, 0x00, 0xf2, 0x30, 0xf6,
0x8a, 0x1c, 0x16, 0x8a, 0x17, 0x30, 0xda, 0x88,
0x17, 0x47, 0x49, 0x75, 0xde, 0x31, 0xdb, 0x89,
0xd8, 0xfe, 0xc0, 0xcd, 0x80, 0x90, 0x90, 0xe8,
0x9d, 0xff, 0xff, 0xff, 0x41, 0x41, 0x41, 0x41

I just tried to decode it with radare and here it is:

[0x00000000]> eval asm.arch=intel32                                                                                                                                                    
[0x00000000]> pd
.=< 0x00000000, 0 cursor: eb04 jmp 0x6 ; 1 = 0x00000006
| 0x00000002 0 af scasd
| 0x00000003 0 c2bfa3 ret 0xa3bf
| 0x00000003 0 ; ------------------------------------
`-> 0x00000006 256_ 81ec00010000 sub esp, 0x100
0x0000000c, 256 31c9 xor ecx, ecx
.--> 0x0000000e 256 880c0c mov [esp+ecx], cl
| 0x00000011 256 fec1 inc cl
`==< 0x00000013 256 75f9 jnz 0xe ; 2 = 0x0000000e
0x00000015 256 31c0 xor eax, eax
0x00000017 256 baefbeadde mov edx, 0xdeadbeef ; (0xffffffffdeadbeef)
.---> 0x0000001c, 256 02040c add al, [esp+ecx]
| 0x0000001f 256 00d0 add al, dl
| 0x00000021 256 c1ca08 ror edx, 0x8
| 0x00000024, 256 8a1c0c mov bl, [esp+ecx]
| 0x00000027 256 8a3c04 mov bh, [esp+eax]
| 0x0000002a 256 881c04 mov [esp+eax], bl
| 0x0000002d 256 883c0c mov [esp+ecx], bh
| 0x00000030, 256 fec1 inc cl
`===< 0x00000032 256 75e8 jnz 0x1c ; 3 = 0x0000001c
.====< 0x00000034, 256 e95c000000 jmp 0x95 ; 4 = 0x00000095
| 0x00000039 256 89e3 mov ebx, esp
| 0x0000003b 256 81c304000000 add ebx, 0x4
| 0x00000041 248_ 5c pop esp
| 0x00000042 256_ 58 pop eax ; cursor+0x8
| 0x00000043 256 3d41414141 cmp eax, 0x41414141
.=====< 0x00000048, 256 7543 jnz 0x8d ; 5 = 0x0000008d
|| 0x0000004a 264_ 58 pop eax ; cursor+0x8
|| 0x0000004b 264 3d42424242 cmp eax, 0x42424242
.======< 0x00000050, 264 753b jnz 0x8d ; 6 = 0x0000008d
||| 0x00000052 256_ 5a pop edx
||| 0x00000053 256 89d1 mov ecx, edx
||| 0x00000055 256 89e6 mov esi, esp
||| 0x00000057 256 89df mov edi, ebx
||| 0x00000059 256 29cf sub edi, ecx
||| 0x0000005b 256 f3a4 rep movsb
||| 0x0000005d 256 89de mov esi, ebx
||| 0x0000005f 256 89d1 mov ecx, edx
||| 0x00000061 256 89df mov edi, ebx
||| 0x00000063 256 29cf sub edi, ecx
||| 0x00000065 256 31c0 xor eax, eax
||| 0x00000067 256 31db xor ebx, ebx
||| 0x00000069 256 31d2 xor edx, edx
.-------> 0x0000006b 256 fec0 inc al
|||| 0x0000006d 256 021c06 add bl, [esi+eax]
|||| 0x00000070, 256 8a1406 mov dl, [esi+eax]
|||| 0x00000073 256 8a341e mov dh, [esi+ebx]
|||| 0x00000076 256 883406 mov [esi+eax], dh
|||| 0x00000079 256 88141e mov [esi+ebx], dl
|||| 0x0000007c, 256 00f2 add dl, dh
|||| 0x0000007e 256 30f6 xor dh, dh
|||| 0x00000080, 256 8a1c16 mov bl, [esi+edx]
|||| 0x00000083 256 8a17 mov dl, [edi]
|||| 0x00000085 256 30da xor dl, bl
|||| 0x00000087 256 8817 mov [edi], dl
|||| 0x00000089 256 47 inc edi
|||| 0x0000008a 256 49 dec ecx
`=======< 0x0000008b 256 75de jnz 0x6b ; 7 = 0x0000006b
``-----> 0x0000008d 256 31db xor ebx, ebx
| 0x0000008f 256 89d8 mov eax, ebx
| 0x00000091 256 fec0 inc al
| 0x00000093 256 cd80 int 0x80
`----> 0x00000095 256 90 nop
0x00000096 256 90 nop
0x00000097 256 e89dffffff call 0x39 ; 8 = 0x00000039
0x0000009c, 256 41 inc ecx
0x0000009d 256 41 inc ecx
0x0000009e 256 41 inc ecx
0x0000009f 256 41 inc ecx

Update: Heise has details on solving the whole puzzle.