SimTower Reverse Engineering – Part 2

With the header sorta figured out and the unit data partially figured out, there’s still a lot of the file that hasn’t been determined yet.

Immediately after the last floor’s unit information, the game does a 4 Byte read, followed by some number of 16 Byte entries. As I quickly suspected, the number of entries is exactly the same as those 4 Bytes interpreted as an integer. Perfect, now we at least know the structure of another large chunk of the file.
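A minimal sketch of reading this kind of count-prefixed section in Python is below. The little-endian byte order and the name read_counted_entries are assumptions for illustration, not something taken from the game itself.

```python
import struct

def read_counted_entries(buf, offset, entry_size=16):
    """Read a 4-byte count, then that many fixed-size entries.

    Returns the list of raw entries and the offset just past the section.
    """
    # Assumption: the count is an unsigned little-endian 32-bit integer.
    (count,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    entries = [
        buf[offset + i * entry_size: offset + (i + 1) * entry_size]
        for i in range(count)
    ]
    return entries, offset + count * entry_size

# Synthetic example: a count of 2 followed by two 16-byte entries.
sample = struct.pack("<I", 2) + bytes(range(16)) + bytes(range(16, 32))
entries, end = read_counted_entries(sample, 0)
```

The returned end offset is where the next section would start, which makes it easy to chain readers for the sections that follow.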

Next Parts

But what do we do with this? I started by dumping the values into a terminal to see if I could spot any patterns. Two things jumped out at me: the first and third values increment. The first goes up from 0 to 113, which is the count of floors in the building that have anything on them (including the cathedral’s multiple stories above floor 100). So the entries are grouped by floor, it seems. Some floors seem to have no entries, which appear to be lobbies or otherwise completely empty floors.

Next, I started looking for a pattern in the number of entries on a floor and what’s on that floor. Quickly, I saw that there were 30 entries for floors with 10 condo units on them. This suggests a condo has 3 entries. Knowing the game, I know that a condo has a population of 3 people, so these must be entries related to people!

Checking a floor with 19 offices, there are 114 entries. Offices have 6 people per office, so that’s what this data structure is for. On to the contents of the data structure, past the two values that were immediately apparent. The next pattern I spot is that the second byte is also incrementing, and seems to be the index of the unit on the floor. Now we’ve got 3 / 16 Bytes figured out. What’s next?
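The entries-per-floor check can be sketched roughly like this. It assumes the first byte of each 16-byte entry is the floor index, as guessed above; the function names are made up for the example.

```python
from collections import Counter

def entries_per_floor(entries):
    """Count how many 16-byte entries share each floor index (byte 0)."""
    return Counter(entry[0] for entry in entries)

def people_per_unit(entry_count, unit_count):
    """Return entries per unit if it divides evenly, otherwise None."""
    quotient, remainder = divmod(entry_count, unit_count)
    return quotient if remainder == 0 else None

# The two floors mentioned in the text.
condo_people = people_per_unit(30, 10)    # 10 condos, 30 entries
office_people = people_per_unit(114, 19)  # 19 offices, 114 entries
```

A None result would be a hint that the floor holds a mix of unit types, or that the guess about what the entries represent is wrong.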

I name a couple of people and use the in-game tools to find them throughout the building, and see that byte 7 seems to be the current floor they’re on, and byte 5 looks suspiciously like bit-flags, even if I don’t know what exactly they mean yet. They may show things like whether a person is in the building or not, sleeping, etc. I think the last four bytes are two 16 bit integers, and these may be storing the stress and eval(uation), but these don’t look consistent.

What’s After People?

I decide to put the people data aside and see what’s next. The ProcMon CSV (explained in Part 1) shows a read of 9,216 Bytes, with no read for length before it. This suggests to me that this is a static sized block. It’s divisible by 16, but 576 isn’t a nice “round” number, not like 512. Seeing that this is close to 512, I try the next larger even number. 9,216 / 18 is 512. There’s our nice round number.
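The same guesswork can be automated. This is a small sketch that lists the even record sizes dividing a block evenly and flags the ones that give a power-of-two entry count; the size limit of 64 is an arbitrary assumption.

```python
def candidate_record_sizes(block_len, max_size=64):
    """List (record_size, record_count, count_is_power_of_two) tuples
    for every even record size that divides block_len evenly."""
    candidates = []
    for size in range(2, max_size + 1, 2):
        count, remainder = divmod(block_len, size)
        if remainder == 0:
            is_power_of_two = (count & (count - 1)) == 0
            candidates.append((size, count, is_power_of_two))
    return candidates

candidates = candidate_record_sizes(9216)
round_ones = [(size, count) for size, count, pow2 in candidates if pow2]
```

For 9,216 bytes this leaves only a couple of "round" options to check in a hex editor, with 18 bytes x 512 entries being one of them.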

From this, I can strongly infer that I’m dealing with a fixed length of 512 entries, each 18 bytes in size. Maybe they have a similar structure to our unit structure, which also has an 18 byte size. I can also see from a hex editor that not all of them are full. Where else do I have a similar number? I know the count of commercial units (shop, restaurant, fast-food place, etc.) is 419. I’m betting that I’ll have 419 complete values, and the rest as empty placeholders/padding. Let’s see. I’ve included some of the entries in the section, with their decimal values shown.

0: [40, 0, 2, 14, 50, 25, 11, 24, 35, 21, 50, 0, 220, 255, 0, 0, 46, 0]
127: [56, 1, 2, 30, 50, 25, 8, 27, 35, 18, 50, 3, 220, 255, 0, 0, 48, 0]
255: [71, 1, 1, 27, 50, 10, 6, 29, 35, 5, 30, 4, 220, 255, 0, 0, 29, 0]
383: [42, 8, 1, 3, 30, 10, 12, 13, 25, 3, 0, 9, 230, 255, 0, 0, 13, 0]
511: [255, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

One of the things I do with values is stick them in a dictionary to see what sorts of results I get, and how many.

from collections import defaultdict

# commercial_list is a list of "unparsed" commercial objects, each holding the 18 values shown above.
values = [defaultdict(int) for _ in range(18)]
for comm in commercial_list:
#    if comm.values[0] == 255:  # uncomment to skip empty entries
#        continue
    for i in range(18):
        values[i][comm.values[i]] += 1
# And then a quick and dirty print to see the results.
for i, e in enumerate(values):
    print(f"{i} ({len(e)}):")
    print(', '.join([f"{k}: {v}" for k, v in e.items()]))
    # Or sorted with values only, and not counts. Equivalent to using a set().
    print(', '.join([str(k) for k in sorted(e.keys())]))

I’m not going to include the full output, but I can quickly see a few things. Most evident is that the values for the first byte range from 1 to around 100, which suggests that this is a floor number. Except several entries are FF (=255), and these are otherwise empty. How many non-empty values do we have? 419! Yep, this is additional metadata related to commercial tenants, separate from the data in the Units data. Likely this stores their profits, eval, etc. Given that we know the floor now, let’s see what else we can find out.
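Counting the used slots can be sketched like this. The block here is synthetic, built from two of the sample entries shown earlier, and the helper names are only for illustration.

```python
RECORD_SIZE = 18     # bytes per commercial entry
EMPTY_MARKER = 0xFF  # first byte of an unused entry

def split_records(block, size=RECORD_SIZE):
    """Split a fixed-size block into equally sized raw records."""
    if len(block) % size:
        raise ValueError("block length is not a multiple of the record size")
    return [block[i:i + size] for i in range(0, len(block), size)]

def count_used(records):
    """Count the records whose first byte is not the empty marker."""
    return sum(1 for record in records if record[0] != EMPTY_MARKER)

# Synthetic 9,216 byte block: 419 used entries followed by 93 empty ones.
used = bytes([40, 0, 2, 14, 50, 25, 11, 24, 35, 21, 50, 0, 220, 255, 0, 0, 46, 0])
empty = bytes([255, 0, 3] + [0] * 15)
block = used * 419 + empty * 93
records = split_records(block)
used_count = count_used(records)
```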

I can also see that the 15th, 16th and 18th (last) bytes are 0, at least in this test tower. I see that byte 3 only has the values 0, 1, 2 and 3. Knowing other things about the game helps me look for other values. I know there are 5 fast food places, 5 restaurants and 11 different shops. None of the bytes has 21 different entries, but the 12th byte has 11 entries, ranging from 0 to 10. Perhaps this is the index of the variant, which is supported by there being many more entries with 0-4 than with the other 6 values, which occur in nearly identical numbers. So I think this byte is the unit’s variant.

How do I check this easily? I used my thumbnail generation code from before and made the unit look up its entry in the commercial data section by its index, and then use the variant to choose a colour. In this case, I generated 11 different HSV colours, converted them to RGB and used those. From this, I can see that this is indeed the variant, and from looking at the game, I can see what each value refers to.
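Generating evenly spaced colours is straightforward with the standard library. This is a sketch of one way to do it with colorsys, not necessarily how the thumbnail code does it; full saturation and value are assumptions.

```python
import colorsys

def variant_palette(n=11):
    """Return n evenly spaced hues as 8-bit (r, g, b) tuples."""
    palette = []
    for i in range(n):
        # Hue steps around the colour wheel; saturation and value stay at 1.0.
        r, g, b = colorsys.hsv_to_rgb(i / n, 1.0, 1.0)
        palette.append((round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = variant_palette()
colour_for_variant_0 = palette[0]
```

With this scheme variant 0 comes out as pure red, and each of the 11 variants gets a distinct colour.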

But something is wrong. That red block on the bottom left? Those three aren’t all the same, so something is going on here. This looks correct overall, but it seems that the order things were built in matters. There don’t appear to be any pointers back to the units in the Commercial Data Section, and there isn’t anything in the Unit data structure either. But there’s that 188 Bytes for each floor. I divide 188 by 4, but 47 isn’t a number that means much in SimTower. 188 by 2 is 94, which is much more interesting. 375 tiles (the width of a floor) / 4 tiles for the smallest unit (a single hotel room or parking stall) gives 93.75 maximum units on a floor. So, maybe that 188 bytes is a 2 byte count, and up to 93 remapping pointers?
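The arithmetic and a first stab at reading the table can be sketched as follows. Treating the 188 bytes as 94 unsigned little-endian 16-bit values is only the hypothesis from the paragraph above, and the names are invented for the example.

```python
import struct

FLOOR_WIDTH_TILES = 375
SMALLEST_UNIT_TILES = 4
FLOOR_TABLE_BYTES = 188

def max_units_per_floor():
    """Whole units that fit on one floor: 375 / 4 = 93.75, so 93."""
    return FLOOR_WIDTH_TILES // SMALLEST_UNIT_TILES

def read_floor_table(block):
    """Interpret the 188-byte per-floor block as 94 16-bit values."""
    if len(block) != FLOOR_TABLE_BYTES:
        raise ValueError("expected a 188 byte block")
    # Assumption: unsigned, little-endian.
    return list(struct.unpack("<94H", block))

table = read_floor_table(struct.pack("<94H", *range(94)))
```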

Doing them in order allowed me to figure out the indices, but now I have the issue that they’re not correctly re-mapped. So how does the re-mapping table work? There isn’t an initial count byte, so it looks like there are 94 possible 2-byte entries here.

After trying to figure out the mapping, I decided to move on, but will revisit the section. It’s tantalizingly close, but having more entries in it than units on a floor, as well as repeated entries, suggests that it’s not as straightforward as I’d hoped.

Elevator Data

Next thing in the data we have is a repeated data structure composed of a 194 Byte read, a 480 Byte read, two 120 Byte reads and then up to 29x 324 Byte reads and up to 8x 346 Byte reads. This was all repeated 24 times for a building full of elevators. The game allowing only 24 elevators is a well known limitation, so this is clearly all of the elevators.

Normal and service elevators can be 29 floors tall and have a maximum of 8 cars, so these segments likely store that information about an elevator shaft. This is further confirmation that these are indeed elevators. For a tower with fewer than 24 elevator shafts, only the 194 Byte header exists for each unused slot, so somewhere inside this block is metadata related to height. 120 Bytes is probably 120 flags related to floors.

Looking at a building full of elevators, I can see that a maximum height elevator has 29 of the 324 Byte entries and minimum height elevator has 1. I can also see that the last 8 bytes in the header are an elevator car’s home floor, with the value being the start floor of the elevator if those cars don’t exist.
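Putting the read sizes together gives a rough size for each of the 24 elevator slots. This sketch assumes there is one 346 Byte read per car, which the reads suggest but which isn't confirmed; the constant and function names are made up.

```python
HEADER_BYTES = 194
STATUS_BYTES = 480
FLOOR_FLAG_BYTES = 120   # read twice per elevator
FLOOR_RECORD_BYTES = 324
CAR_RECORD_BYTES = 346

def elevator_slot_size(floor_records, cars, in_use=True):
    """Estimate the number of bytes one elevator slot occupies in the save."""
    if not in_use:
        # Unused slots only have the header.
        return HEADER_BYTES
    if not 1 <= floor_records <= 29:
        raise ValueError("expected 1 to 29 floor records")
    if not 1 <= cars <= 8:
        raise ValueError("expected 1 to 8 cars")
    return (HEADER_BYTES + STATUS_BYTES + 2 * FLOOR_FLAG_BYTES
            + floor_records * FLOOR_RECORD_BYTES
            + cars * CAR_RECORD_BYTES)

largest = elevator_slot_size(29, 8)
smallest = elevator_slot_size(1, 1)
unused = elevator_slot_size(0, 0, in_use=False)
```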

Further investigation reveals a count of cars, starting tile from the left the elevator is on, and the top and bottom floors. But what did I do to figure this out? I made a change, such as adding another car or changing the elevator height to see what changed.
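The change-one-thing-and-compare approach boils down to a byte diff. A minimal sketch, assuming the two buffers are the same section (for example the 194 Byte header of the same elevator) taken from a save before and after the change:

```python
def diff_bytes(before, after):
    """Return (offset, old_value, new_value) for every byte that differs.

    Only the overlapping length is compared, so this is best used on
    fixed-size sections rather than on whole files whose length changed.
    """
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]

# Synthetic example: two bytes change between the saves.
changes = diff_bytes(bytes([1, 2, 3, 4]), bytes([1, 9, 3, 5]))
```

Comparing whole files is less useful here, because adding a car or changing the height also changes how many of the variable-length records follow the header.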

Sure looks like elevators. Black is normal elevators, red is service and blue is express elevators. Express elevators have a slightly different format. They only have a floor data structure for each floor they can stop at, not for all of the floors they cover. This is likely because in the game, they only stop on underground floors, and at floors 1, 15, 30, 45, 60, 75, and 90 (which all can be sky-lobbies).

From there, the only other data in the header that isn’t determined yet is a 56 byte segment in the middle. There appear to possibly be 4 sub-sections that are 14 bytes each. One segment is all 5s, which is the default number of floors for an elevator car to service. Which means that this is storing the configuration of the elevator scheduler in the elevator properties window. Changing the settings confirms this, but there are only 6 periods to configure. This could mean that there’s a 7th hidden period, or more likely in my opinion, this was changed at some point in development.

With that figured out, the elevator header is done. But what about the next 480 bytes, or the two sections of 120 bytes? Those look suspiciously like information about floors. What happens if we take the info and assign every 4 numbers a colour and generate a 4 x 120 byte image for the 480 byte segment? This could be a status indicator of some sort for each car, so perhaps we need to split into 8x 4 bit values. Let’s see what that looks like too.
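Both ways of slicing the 480 byte segment can be sketched like this. The nibble order (high half first) is an assumption, and the function names are only for illustration.

```python
def rows_of_four(block):
    """Split the block into 4-byte rows, one per floor (480 bytes -> 120 rows)."""
    if len(block) % 4:
        raise ValueError("block length is not a multiple of 4")
    return [tuple(block[i:i + 4]) for i in range(0, len(block), 4)]

def nibbles(row):
    """Split each byte of a row into two 4 bit values, high half first."""
    values = []
    for byte in row:
        values.append(byte >> 4)
        values.append(byte & 0x0F)
    return values

rows = rows_of_four(bytes(480))
example = nibbles((0x12, 0x34, 0x56, 0x78))
```

Each row then gives either 4 pixels (one colour per byte) or 8 pixels (one per 4 bit value), which lines up with the maximum of 8 cars.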

That certainly looks like it has something to do with elevator car statuses. But what exactly? The values don’t seem to match those shown in the elevator’s status section, but there is definitely a pattern there. I see repeated patterns at lobby floors, as well as underground (but not on B10, which can only be used for the Metro line’s tunnel). But also similar patterns between the express elevator and the normal elevator, especially where they overlap. Is this from people on those floors wanting to get somewhere? Something else? I’m still not sure, but being able to visualize data like this really helps.

Moving on, next I looked at the 324 byte floor data segment. The first thing I notice is that 324 bytes would be an even 80 entries of 4 Bytes, plus a single 4 Byte header, and this is exactly what this structure looks like. I could see that the values looked exactly like IDs in the people data segment. Closer inspection indicated that after the header, there are two independent segments. I noticed this because on the bottom floor of the elevator, one half was completely empty, with the same being true of the other half on the top floor. On floors not serviced, both were empty. But what about the header? It looks like 4 single byte values? A quick look at the game showed that the first and third values were the number of people waiting on the left and right of the elevator, or going up and down, respectively, and that this count capped at 40.
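Here is a sketch of parsing one of these floor records. It assumes the IDs are unsigned little-endian 32-bit values and that each half holds 40 of them; which half belongs to which direction isn't settled above, so the keys are deliberately neutral.

```python
import struct

FLOOR_RECORD_BYTES = 324

def parse_floor_queue(record):
    """Split a 324-byte floor record into its header counts and two queues."""
    if len(record) != FLOOR_RECORD_BYTES:
        raise ValueError("expected a 324 byte record")
    header = record[:4]
    # Assumption: 80 unsigned little-endian 32-bit IDs after the header.
    ids = struct.unpack_from("<80I", record, 4)
    return {
        "count_a": header[0],   # first header byte
        "count_b": header[2],   # third header byte
        "queue_a": ids[:40],
        "queue_b": ids[40:],
    }

# Synthetic record: counts of 3 and 2, then IDs 0..79.
record = bytes([3, 0, 2, 0]) + struct.pack("<80I", *range(80))
parsed = parse_floor_queue(record)
```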

With the elevator data mostly sort-of decoded, I decided to move on to the next segments. I’m getting close to the end of the file, so there isn’t a whole lot else. I’m expecting data for the finance window, stairs and escalators and similar, though perhaps some of this is stored in that 490 Byte block at the beginning that I skipped.

The Next Segments

I can see a read of 88 bytes, of 132 bytes, of 12 bytes and of 42 bytes. SimTower does use 32 bit integers in some places, but the game is still really a 16 bit game, so even in places where the save uses 32 bit integers to store the data, the numbers never get that large. This means that something that looks like XX XX 00 00 in the game file is usually a giveaway to interpret this as a 32 bit integer. The first and second entries look like this, while the third and fourth don’t.
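The XX XX 00 00 giveaway can be turned into a quick check. This is a rough heuristic sketch; the 90% threshold is an arbitrary assumption.

```python
import struct

def looks_like_small_u32s(block, threshold=0.9):
    """Guess whether a block is a run of 32 bit integers holding 16 bit values.

    Returns True when most 4-byte groups end in two zero bytes.
    """
    if not block or len(block) % 4:
        return False
    groups = [block[i:i + 4] for i in range(0, len(block), 4)]
    hits = sum(1 for group in groups if group[2] == 0 and group[3] == 0)
    return hits / len(groups) >= threshold

likely = looks_like_small_u32s(struct.pack("<4I", 100, 2000, 30000, 5))
unlikely = looks_like_small_u32s(bytes([1, 2, 3, 4] * 4))
```

A 42 byte block fails immediately because its length isn't a multiple of 4.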

The second entry has numbers that match the finance window, so that’s easily decoded.

I got sidetracked while looking at that segment, and I found that the next segment, at 1026 bytes long, had an initial value that was what looked like a 2 byte count, and then an increasing index up to that count. I looked at what else shared that count, and it was the number of parking stalls in the tower. So this stores some information related to the parking stalls. Once I got past the basic structure, which appears to be a count of connected stalls, verified by removing a stall and seeing that the count of stalls with red ‘X’s in them was subtracted here, the rest appeared to be a 2B index value. However, once I removed and added a stall, and checked, the values got a bit weird, so I’m not sure what this does.

Next comes a 22 Byte long block, which is mostly empty, so maybe this is padding or a placeholder?

After that come 64x 10 Byte blocks. There are a maximum of 64 stairs and escalators in the game, so that’s what’s stored here, as there isn’t much else left in the file, and I haven’t found it anywhere else yet.

Looking at the actual structure, the first byte is 01 if there’s a set of stairs or an escalator built, and the second appears to indicate what is built. Interestingly, 0 is escalator, so maybe they were added first. There are 6 values in total, covering the stair and escalator variants. The next two bytes are the same for all the stairs/escalators in one test tower, and they appear to be how far from the left side it is. The next byte is the start/bottom floor, though this is potentially two bytes.

The next set of two bytes, or single byte and second padding/other byte, appears to be the count of people going up and down the escalator respectively. How did I figure this out? Well, I guessed that the number of people shown in the game must be stored somewhere, and like the elevator cars, the total number of people should be stored inside this segment. But I had an escalator with 14 people on it, and I didn’t see 14. After staring at it a bit, I realized that 9 and 5 equal 14. Sure enough, this value matched on all the stairs and escalators I checked. I figured out the direction by looking at the counts first thing in the morning just after the fast food places opened, and people were only going up from my floor 1 lobby, via escalators, to them.

Final Bits

After the escalator/stairs section are 8 segments of 484 Bytes each. This looks suspiciously like a 4-Byte header, and 120 entries afterwards, one for each floor. Each entry might be a 4B value, but it could also be 4x 1B or 2x 2B. I didn’t have much luck decoding this one, other than to note that the first 4 Bytes are a header, because it’s a specific value if the rest of the entry is empty, and that the 120 values don’t look like 4 Byte integers. I’ll need to poke at this some more, but it looks like something that maybe isn’t exposed directly in the game and is instead internal simulation related.

Next are 10x 2 Byte entries. My first thought is security offices, as there are a maximum of 10 of these. There are also a maximum of 10 medical clinics, but security offices are treated differently by the game, so it makes sense that these would be noted separately, even though they’re stored in the Units Data section as well. And sure enough, each value is either -1 or the floor that a security office is built on.
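Reading this block is a one-liner with struct. The sketch below assumes signed little-endian 16-bit values, which fits the -1 placeholder; the function name is made up.

```python
import struct

def parse_security_floors(block):
    """Return the floors that have a security office, skipping -1 placeholders."""
    if len(block) != 20:
        raise ValueError("expected 10 entries of 2 bytes each")
    # Assumption: signed, little-endian 16-bit integers.
    values = struct.unpack("<10h", block)
    return [value for value in values if value != -1]

# Synthetic block: offices on floors 5 and 20, the other 8 slots unused.
block = struct.pack("<10h", 5, 20, -1, -1, -1, -1, -1, -1, -1, -1)
floors = parse_security_floors(block)
```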

There are still some more sections to go: I see a bunch of 6 Byte long entries, 10x 4 Byte entries (medical clinics), 16x 12 Byte entries, an 80 Byte entry and a 40 Byte entry. After that are three entries that seem to be the same length in the few towers I looked at: a 4,354 Byte entry, a 2,114 Byte entry and a 3,234 Byte entry. I have no idea what these store, but it’s probably more internal simulation state. A cursory inspection didn’t really reveal any structure, but a more thorough investigation may show something.

At the very end, there’s an 8 Byte read (of what seems to be empty data) and then 16B entries for named entities in the game. I’m not entirely sure how the entries are mapped, but the first entries are for named units, and the rest are for named people in the tower.

Ending Thoughts

I was very quickly able to determine most of the overall structure of the file, but things got more difficult towards the end of the file where there were lots of blobs that weren’t structured in a way that made their usage apparent.

My approach of figuring out the reads the game was doing and then looking at the data those reads contained really helped. I’ve looked at newer games that just load the entire file into memory and parse it, and they’re much more annoying to reverse engineer.

I’ve also poked at reverse engineering other games that I was less familiar with, and knowing things like there can only be 512 commercial units in a building really helped when I had a section that was a multiple of that length long.

There’s still a lot to be determined, and lots of unknown values sprinkled in the documentation, but overall, I got a large proportion of the file format figured out. As was the case for my SimCity 2000 city format reverse engineering project, the first 90% takes 10% of the time, and the remaining 10% takes 90% of the time.

I’ll need to decide whether or not I’m interested in grinding out more of the documentation on the format, but the documentation is open source on GitHub, so other people can always use it as a basis and open pull requests if they discover something new. But there are still parts I skipped that seem like they’d be relatively easy to do, so I’ll probably do some more work on this before I set it aside for whatever my next project is.

Or I’ll start a re-implementation project of the game. No guarantees…
