Aug 24 2005

Lies, damn lies and approximations

Tag: Code,ProgrammingAdam Wright @ 6:28 pm

Alas, I’ve not had time to write another article just yet but I’ll get around to it soon. I did, however, compile some statistics about the Decal project that you might find interesting. Everything here is likely an underestimate, as the project grows continuously.

File Type Number of files Number of lines
C++ 120 39600
C++ Header 120 50075
IDL 11 3907
XML 13 3235
Total 264 96817
Developers Greater than 10, over time
Manhours On the order of 103
Market cost Manpower value of over > $100,000.00
Why? Passion

Ah, the power of a committed community.


Aug 19 2005

Reapplying the Decal: The surviving stumble to victory!

Tag: Code,ProgrammingAdam Wright @ 3:24 pm

Well, the total of received answers to the questions in the last part was 2 which, whilst somewhat less than I hoped, just makes it more of an achievement to those that tried. I’m sure lots of you are sitting at home using your solutions as some form of abstract modern art, rather than wasting them on this site, and you’re right to do so!

Ahem. Let’s see what we got.

Bonus question 1What changes would you make to print “Hello Hello”?

This one isn’t too great a leap from what we did in the article – rather than call both our functions in succession, we’ll call the same one twice. The only minor roadblock is working out the offset to the function, which is 013 (extra credit if you just read it from the original source!). Hence the change you wanted (using the original “helloWorld.exe”) was to change byte 0409 to “13”.

Bonus question 2What changes would you make to print “Hello Decal”?

Another short intuitive hop, and one for the adventurous. We reason that if we can change the code, why can’t we change the data? Indeed, “Decal” has exactly the same number of letters as “World”, so one can just use the editor to replace the “World” in the data section (address 0838) with “Decal” and voila – success!

Bonus question 3We know the printing functions were generated from an object in our source C++. Which of the 3 blocks in the given assembly do you think belonged to that object originally?

The 3 given blocks were basically identified as being “The start of the big list”, “Print Hello function” and “Print World Function”. From our past discussion we know that objects “know about” one single concept – in our case, the concept of printing “Hello World”. As such, it’s unlikely that the object contained the original code that makes up the whole big list, and as such the last 2 of the 3 functions are the best candidates for being part of the object. End result, the functions beginning at addresses “00401020” and “00401040” are the right answer.

Anyone who deduced that, full marks! Anyone who read up to Q5 then used that information, also full marks (plus a nod for having excellent exam technique)!

Bonus question 4…Use the nop instruction to stop the program from waiting for enter before exiting, and explain what you did.

More deductive reasoning, the staple of the sport. We know the code prints “Hello”, then prints “World” before finally waiting for a keystroke. We can see in the big list that the call to print “Hello” is followed by the call to print “World” – and both are followed by another call that we didn’t discuss! This being the last call in the program, we can guess with good accuracy that it’s this function performing the wait.

Taking nop, we replace all the bytes of the call and the address it’s calling with the nop instruction code (starting at 040d, replace 5 bytes with “90”) and run the result. Did you see that? No more waiting! Fantastic!

Advanced bonus question 5…Produce a single C++ file that will compile to the given “helloWorld.exe” program.

Well, I can’t do too much explaining on this one – you’d have to have a solid understanding of C++ to even try it. The overall concept is that one sets up some source code using the hints in question 5 and populates it with the simplest code that performs the job. Then, by analysing the original “helloWorld.exe” assembly, you alter your code to use the external objects and functions that I used. Finally, a combination of binary comparison and detailed examination of both assembly dumps can be used to “tweak” your code to match.

The winner by a country mile is Portaren, who’s question 5 answer even caught the curve ball of a 32 byte buffer when a 1 byte would have been more logical. Ignoring formatting and minor (basically stylistic) differences, his code is so perfectly identical as to warrant full marks. Portaren, leave a comment or drop me an email if you want to pick your prize! An honourable mention also goes to “Long Duk Bill” for being the only other person to submit a solution. Everyone else who didn’t send anything or who’s just curious can now download the original project files and source from here (GPG signature file here). I’ve bundled in the compiled libctiny I used, but you can make your own from the original sources at this site.

Before we finish this series, I’d like to thank everyone for playing – despite growing more complex in nature, I hope what we went through was instructive and interesting. I’d love to have your comments on the style in general as well as what could be improved as I’d imagine I’ll write at least one more Decal focused article. Whilst I have the time, I also plan to start a series on the basic reasons for the existence of numbers (take that, newly increased visitor count!), as well as finishing my previous run on software design at some point.

Vade in pace,

Adam Wright (as Asriel)


Aug 15 2005

Reapplying the Decal: Answers coming soon

Tag: Code,ProgrammingAdam Wright @ 11:16 pm

As I’ve only had one answer (albeit an excellent one) to the previous set of questions, I’m going to leave posting my solutions for a little while longer. Note that I have no expectations of everyone doing all 5 (or even most people doing any) but if you want to understand the topic, a hands on approach is pretty much mandatory.


Aug 13 2005

Reapplying the Decal: The practical application of theory

Tag: Code,ProgrammingAdam Wright @ 4:14 pm

We’re on the home stretch now – anyone who’s survived this far is a brave soul indeed and those who make it through today shall be hailed as kings among men (by me anyway). If you’re just joining us, please read the previous articles before you start. Finally, before we begin, make sure you’ve got this archive extracted, and make a copy of the “helloWorld.exe” file called “worldHello.exe”. We’ll be rewriting the program today and we wouldn’t don’t want to break the original in our experiments!

As promised, we’ll start with the actual x86 assembly “big list” of the program, slightly trimmed to show only the information we’re need (If you want the entire assembly listing, here it is). Again, don’t be scared – it’s actually quite easy to understand. You might want to print it out before reading on.

00401000 55 push ebp
00401001 8B EC mov ebp, esp
00401003 E8 18 00 00 00 call 00401020
00401008 E8 33 00 00 00 call 00401040
0040100D E8 4E 00 00 00 call 00401060
00401012 33 C0 xor eax, eax
00401014 5D pop ebp
00401015 C3 ret
00401020 55 push ebp
00401021 8B EC mov ebp, esp
00401023 68 30 20 40 00 push 402030h
00401028 E8 58 00 00 00 call 00401085
0040102D 83 C4 04 add esp, 4
00401030 5D pop ebp
00401031 C3 ret
00401040 55 push ebp
00401041 8B EC mov ebp, esp
00401043 68 38 20 40 00 push 402038h
00401048 E8 38 00 00 00 call 00401085
0040104D 83 C4 04 add esp, 4
00401050 5D pop ebp
00401051 C3 ret

Let’s make some sense of this. Just like in your hex editor, the first column is the address that the line being shown corresponds to in the file. “But”, I hear your ask, “00401000 is way bigger than the size of the file!”. Exceptional observation indeed, if that was you I heard – The number is actually the “base address”, which is pretty much where the program will end up in memory. Don’t worry about this, as today I’ll translate each of these numbers into the file offset for us.

The second part shows the actual representation of the instructions. This is a chunk of the data from the file at the given address. Let’s check this – load “worldHello.exe” in your hex editor, and scroll down until you find the closest row to 0400 (which corresponds to 00401000). As we read along the line we should start seeing the data shown within our assembly listing – and yes indeed, after a long run of “00” up pops “55 8b ec…”, just as expected. Fantastic!

The final 2 columns are exactly like the “program” we ran yesterday – First comes the assembly instruction, then the associated parameters. This is the bit we’re most interested in, as it gives the human readable portion of the assembly (something we can actually understand). Both of these last two columns were created from the data in the second column by the tool used to “disassemble” the “helloWorld.exe” program into the big list.

Now we’ve got our big list, and have some idea about how to read it, let’s start the process of changing it! First, we need to roughly work out what’s going on, based on our new understanding of assembly, so let’s “run” this program in our heads just like yesterday, starting from the top.

  1. push ebp – Well, we’re not sure what an ebp is, but it must refer to some memory somewhere, as it’s getting pushed onto the stack. Let’s ignore it for now, and hope it doesn’t matter too much.
  2. mov ebp, esp – “mov”? We’ve not seen that before, but all it does is move some memory from one block to another, without it going onto the stack. We can ignore this as well.
  3. call 00401020 – Aha, a call! We know what this does. It’s going to go run the instructions located at 00401020
  4. call 00401040 – Hmm, another call straight afterwards, going to a function that’s near the first one.

Woah, let’s stop for a second now and think about what we’ve found. We know we’ve got a program that prints “Hello” and “World” to the screen. We know we want to change that to print “World” and “Hello” instead. What if the printing functions for each word were separate? In this case, we’d expect them to be called straight after one another – which is exactly the behaviour we’ve just noticed in our assembly. If we can see that both the called functions are similar, it’s likely that we’ve found our printing functions.

So, off to the assembly again, and let’s look at the function under address 00401020.


push ebp
mov ebp,esp
push 402030h
call 00401085
add esp, 4
pop ebp
ret

Well, that looks interesting – it’s moving memory around just like the beginning of our big list, but then it’s pushing something right before it’s calling something. Looking over the function at 00401040, it’s virtually identical – the only difference is the memory it’s pushing before it makes the call! Our hypothesis is looking stronger and stronger. What if the function at 00401085 being used in these functions actually does the printing? If so, we’d expect the memory pushed onto the stack to be the data the function needs to do it’s job – i.e. something to print (this is just like yesterday, when the “add” function needed the numbers to add on the stack). The first of our functions is pushing data at 402030 onto the stack, so everyone eagerly go look at address 0830 in the file (0830 corresponds the 402030 address).

[Sound of people tabbing to and from hex editor]

Yes! Amazing! At offset 0830, our file does indeed say “Hello”! We’ve almost certainly found the 2 functions we want, and we’ve seen where they’re used. Going back to the big list, to the site of the 2 calls (00401003), we need to find some method of achieving our goal. As we want to make the program say “World Hello”, and we think we’ve found the two functions that print “Hello” and “World”, the easiest thing to do will be to just swap the two calls around.

The big list shows us that the data for the first call is “E8 18 00 00 00” and the second is “E8 33 00 00 00” – scroll into your file (offset 0403) and find these two blocks of data, one after the other. If you click on the “18”, you’ll be able to type in a new number – enter “33″. Change the “33” in the second block to “18” in a similar vein. Fantastic – we’ve effectively swaped over the two blocks of data, so we should have reversed the calls! Save the file, and let’s give our new program a trial run!

worldHello.exe failing to run

Oh dear, that was impressively destructive. I hope you didn’t opt to send an error report, because we’re going to fix this ourselves!

We’ve two options for finding out what went wrong. We could attach a debugger (a special program that watches the CPU as it executes each instruction, and tells us what the state of the system) – this would certainly help us track down the problem, and if you start doing this on your own you can expect to spend a lot of time inside one. The other option is for me to tell you what happened. Yes, I suppose I could have rigged this so that we choose the right option straight away, but everyone loves an explosion and it serves to show you what happens when you get things wrong!

To expedite things, I’ll explain what mistake we made. We know the “call” instructions that got modified take a parameter of the address they want to call to – We can see that in the assembly listing. The problem arises in that the assembly listing is too smart for it’s own good – it turns out there are lots of different “call” instructions, all behaving slightly differently but doing the same basic job. A single word is not expressive enough to display this data for us; hence the big list shows all “call” instructions as equal. The only way we’ll ferret out exactly what’s happening is to look at that intimidating data in the 2nd column, and understand what it means.

In our case, the call instruction corresponds to the “E8” on the far left of the hex in the second column. This is the instruction number – the actual data your CPU will use to decide where to send the instruction. Intel has kindly provided us with a big manual of all these numbers, and what they mean, which tells us that “E8″ actually refers to “Call near, displacement relative to next instruction”. So, the parameter to our call isn’t actually an address, it’s how far away from us the call should go. It seems we can’t always trust the assembly given to us by tools – it’s being “helpful” and calculating where the call is going for us, rather than showing us what’s strictly happening.

This means we need to work out the “relative offsets” – how far away from each call instruction the function we want to call is. This isn’t too hard because we’re swapping one call instruction 5 bytes forward (the number of bytes of data for the instruction, if you add them up), and one it going 5 bytes backwards. So, what’s was 33 becomes 38 (5 bytes “further away” to the call) and what was 18 because 13 (5 bytes “closer”). So, let’s make this change. The new data for the first call becomes “e8 38” – change the byte just as we did before. The second changes to “e8 13”, make the change in the same way. Save the result, cross fingers and toes and run the program.

worldHello.exe running with modifications!

Being English, this is the point where I quietly nod my head, shake a hand or two and ponder the meaning of life. Other cultures might want to apply different celebratory techniques, because we’ve achieved our goal! We battled through a remarkably complex set of concepts, and emerged victorious and educated – I feel like I should make certificates. I hope everything was reasonably easy to understand, but these are hard ideas to wrap your head around. If you’re stuck anywhere, leave a comment and I’ll see what I can do to help. Remember: there’s never any shame in asking for help, as long as you ask politely and have tried to understand things from the given material.

So, that about wraps up this 4 part series introducing the “memlocs” process. What’s done for Decal is pretty similar to this, albeit much more involved. We don’t “change” the client to test the functionality we want, but we do inject entirely new functions that will call the functions we find when AC is running. If you enjoyed doing this small exercise, then it’s quite possible you’ll be able (with work and research) to find “memlocs” yourself – I’ll post something later about where you can carry on learning this topic in more detail. Even if you hated every moment of it, at least you’ve got a better idea of the general process. My next article will be a less detailed, more accessible view of another part of the Decal update process, so everyone I alienated with this with can come back and join us.

Oh, wait a second – I can’t forget the promised super bonus questions! There are no guaranteed riches or fame, but I can give you mostly anything you like from my L60 character on Thistledown (if AC is your thing) or my L60 Human Mage on Argent Dawn EU (if you’re a Warcraft player). Hold onto your hats, and give these a try!

Bonus question 1: What changes would you make to print “Hello Hello”?

Bonus question 2: What changes would you make to print “Hello Decal”?

Bonus question 3: We know the printing functions were generated from an object in our source C++. Which of the 3 blocks in the given assembly do you think belonged to that object originally?

Bonus question 4: There’s another instruction that your processor understands: “nop”, or “No operation”. It takes no parameters, has instruction number “90” and does absolutely nothing (the CPU skips right over it). Use this instruction to stop the program from waiting for enter before exiting, and explain what you did.

Bonus question 5 (Advanced Extra Credit): The C++ source code consisted of a single object, “HelloWorldPrinter” which had 3 functions “printHello”, “printWorld” and “waitForReturn”. When compiling, I turned off standard library linking, and used libctiny to reduce the size of the result. All compiler optimisations were off, and no debug symbols were generated. Use this and anything else you can glean, produce a single CPP file that will compile to the given program. Ignoring formatting, the closest to my source code wins!

Answers in the comments please and yes, this does mean could in theory copy someone else but as I’ll only take the first “correct” answer for each question, this shouldn’t be a big deal. Results of the quiz (if I get any replies) will appear in a couple of days, along with the project source.


Aug 12 2005

Reapplying the Decal: Know your materials

Tag: Code,ProgrammingAdam Wright @ 8:00 pm

Now we know how the program exists on your computer, we’ll learn a little bit about how your CPU actually runs the instructions once they’re loaded. As usual, please make sure you’re up to date on the previous articles; It will be very hard to understand if you don’t. Also, if there’s something that’s not clear or you spot a mistake, please leave a comment – I don’t knowingly leave errors in, and would like everyone who’s playing along to get the most out of this series. Finally, note that will be the last “pure theory” article, so we can get back to the interesting Decal specific stuff. Let’s get to it!

Here’s a secret that will make our lives a lot easier. The big instruction list is actually subdivided! Inside the big list is actually lots of little lists, groups of instructions that are stored together. We call these groups “functions”, as they do just one thing. Rather than include each group everywhere it’s used, the compiler will only include the group once. After that, when it needs the “functionality” that the “function” provides, it will tell the CPU to stop executing the big list, and to start executing one of these groups. When the CPU is finished, it will go back to where it was in the list.

This is very useful to us, as the compiler will have created these functions for out of our original objects. We know our program prints “Hello” and “World”, so the odds are good that there are functions specifically for printing “Hello” and “World” inside the list. In fact, there is almost certainly a function for “Printing something”, and then our two functions which use the “Print something” function themselves.

Enough waffle – let’s role-play! Today, we’re going to be CPUs! Let’s set ourselves up with some memory, and then we’ll try and execute an instruction list.

State of memory after execution: Phase 1

I’ve put 3 things inside our memory. Coming first, we guess that “Add” is a function – it’s a list of instructions that can add two numbers together. Remember, as a function, we don’t have to copy “add” into the big list, we can just tell ourselves to execute it and watch what happens! We’ve also got 2 boxes of data – the number “2” and the number “5”. Above each box is the “memory address” of that box. This tells us where in memory that box is stored, and we’ll use it to uniquely identify the box in our instructions.

The scary tower on the right is the “Stack”. This is a sort of temporary memory that we’ll use when we’re executing our program. We can “push” data onto the stack, and it will go into the topmost free slot. We can also “pop” the stack, and the topmost item will come back into our memory.

So, without further ado, let’s run the following assembly “big list”, specially designed for our “Weblog Reader” processor.

push 02
push 03
call 01
pop 02

OK, jot that down on a piece of paper and put your finger on the first line. Your finger is the “execution cursor”. Every time we finish an instruction, move it down a line. Notice how each instruction is very simple, consisting of just one operation and that they take extra information in the form of “parameters” – basically additional information for the instruction to do it’s work. Let’s begin, and it should all become clear!

  1. push 02 – We need to take the box labelled 02, and push it onto the stack. Simple enough, and let’s see what we now look like.

    State of memory after execution: Phase 2

  2. push 03 – Basically the same again, except from a different address. Notice how the stack “builds up” as we push things onto it; it’s like a giant pile of plates!

    State of memory after execution: Phase 3

  3. call 01 – Hmm. We need to “call” the add function. It’s going to take all the data on the stack, add it together, and put the result on the stack for us. Great – We don’t need to do any work ourselves, our function is doing it for us.

    State of memory after execution: Phase 4

  4. pop 02 – Aha! We can now take the result we received from the stack, and put it back into box 02!

    State of memory after execution: Phase 5

Superb. In just 4 instructions, we’ve successfully executed our program, and worked out that 2 + 5 = 7. Our memory now reflects the result, our user is happy, and we’ve shown that Intel and AMD have nothing on us!

Back on earth, we’ve just played the part of a “Stack machine processor”. Using this technique, we can run basically every program we can think of! The slight dampener is that your CPU is not strictly the same, but don’t panic – what we’re dealing with will be close enough to finish our task in part 4, where we’ll make “Hello World” into “World Hello”!

So next time we’ll start by disassembling the helloWorld.exe file into a list of instructions similar to those we just executed. We’ll work out roughly what the instructions in the list do, find the functions and then determine what we need to edit in order to change the output. Then, with one fell swoop, we’ll modify Hello world to do our bidding! Mmmmm, we’re so close, one can almost taste it.

Oh, and because the last “advanced” bonus question was answered by Enolive with almost precognisant alacrity, here’s another: Can you tell me what alternative to stack based processing an x86 CPU uses, and name 2 virtual machines with widespread deployment that are purely stack based (before their code is JITed)?


Next Page »