*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*
+                                                                   +
*            The Definitive Guide to ROM Hacking Tables             *
+                                                                   +
*                               v1.0                                *
+                                                                   +
*                            by InVerse                             *
+                                                                   +
*                             09/03/01                              *
+                                                                   +
*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Introduction:

As the title suggests, this document is a highly detailed guide to the creation and use of table files for the purposes of ROM hacking. Table files are one of the most integral and basic items needed for ROM translation and text modification and are not optional if you hope to move beyond the crudest levels of ROM hacking. This document will explain how to create table files for ROMs from any system in both English and Japanese (or any other language you might desire).

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Table of Contents:

   I.    Introduction
   II.   Table of Contents
   III.  Document History
   IV.   Required Tools
   V.    Getting Started
   VI.   Finding Font Values with Nesticle
   VII.  Finding Font Values by Relative Searching
   VIII. Building Japanese Tables
   IX.   Control Codes
   X.    Anomalies
   XI.   Compression
   XII.  Table Making Programs
   XIII. Conclusion
   XIV.  Credits
   XV.   Resources
   XVI.  Contact Info

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Document History:

07/15/01 - v1.00 - Initial Release

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Required Tools:

Regardless of which system the ROM you're planning to hack belongs to, most of the core utilities will remain the same. The only thing that should vary between systems is the emulator you use to view the ROM. Note that all utilities mentioned in these lessons are specific to Windows and/or DOS. There are very few ROM hacking related utilities for MacOS, Unix or other operating systems. If and when such tools are developed, the techniques should remain the same; only the execution will differ slightly.

For the purposes of the initial lessons, you'll need:

   * Nesticle (DOS or Windows version)
   * A Hex Editor geared toward ROM hacking (Hexposure recommended.)
   * An NES ROM (Super Mario Bros 2 will be used for the demonstration.)

Later lessons will also require the following:

   * A Tile Editor (Tile Layer for DOS recommended.)
   * Relative Search

To follow the section on creating Japanese tables, you'll need:

   * NJStar Communicator Japanese Word Processor
   * NJWin Internet Viewer (or some other way to view Japanese)

(Free demos of these programs can be downloaded from www.njstar.com.)

There will also be a section covering the following programs:

   * Table Auto-Generator
   * Table Maker

(It won't be necessary to complete this section to fully understand tables. These programs simply automate a few of the processes that you will learn in these lessons.)

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Getting Started:

The first thing you need to understand is what a table file is and what it does. Each font character in a ROM has a specified hex value. A normal hex editor will show you the hex values and their associated ASCII values. That's all well and good if the ROM you're wanting to hack happens to have its font characters set to their corresponding ASCII values, but that rarely happens. A hex editor that is geared towards ROM hacking will read a specified table file and then display the corresponding font characters with their associated hex values. So, for instance, if the capital letter 'A' is equal to hex value 74h in the ROM, it will display A in the right column display instead of the lowercase letter 't', which has an ASCII value of 74h.

This is the table file for the Super Mario Bros 2 ROM:

D0=0
D1=1
D2=2
D3=3
D4=4
D5=5
D6=6
D7=7
D8=8
D9=9
DA=A
DB=B
DC=C
DD=D
DE=E
DF=F
E0=G
E1=H
E2=I
E3=J
E4=K
E5=L
E6=M
E7=N
E8=O
E9=P
EA=Q
EB=R
EC=S
ED=T
EE=U
EF=V
F0=W
F1=X
F2=Y
F3=Z
F4=-
F5=?
F6=.
F7=?
FB= 

The first column is the hex value.
The second column is (obviously) an equals sign and the third column is the font value. So in Super Mario Bros. 2 the letter Z is equal to the hex value F3h and the number 4 is equal to the hex value D4h. The last value, FBh, appears to be blank and that's exactly what it is: FB is the value of the space that appears between words.

Once you have your font values, actually creating the table is easy. You just type in each hex value, an equals sign and then the corresponding font value. There are also a few special functions that can be included in a table file (such as line breaks) but I'll wait to discuss these later. A table is nothing more than a text file, so you can create it in Notepad or whatever text editor you prefer. Once you have the values entered, save the file with a .tbl extension. So if your Super Mario Bros 2 ROM is named smb2.nes, you would name the table file smb2.tbl. While it's not absolutely necessary to match the ROM and table names, most hex editors will automatically search for a like named .tbl file and open it along with the ROM.

So how did I find the hex values of each character? Well, in this case I was fortunate and was able to use the simplest possible method for doing this, which is covered in the next section entitled "Finding Font Values with Nesticle."

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Finding Font Values with Nesticle:

This is the absolute simplest way to obtain the hex values of each font character. Unfortunately, it will only work on ROMs that can run in Nesticle, which means only NES ROMs and not even all of them. It's the best way to initially build table files, though, so you understand exactly what you're doing before moving up to the more advanced methods.

Start by opening the ROM you want to hack in Nesticle. Once it's running, pause the ROM (ALT-P) and then press F2 to view the pattern editor. Somewhere on the screen you should see the font.
In Super Mario Bros 2, you'll find it down in the bottom right corner. Click on the number 0 and you'll see a little window pop up in which you can edit the character. More importantly, you'll see a 2 digit hexadecimal value in the upper left corner of that window. This is the value of that particular character. So the value of 0 is D0h, P is E9h and ? is F5h, just like in the example table above.

If you don't see the font, it could mean one of two things. First, it might just be that the font isn't currently in memory. Nesticle's pattern editor only displays tiles that are currently loaded into memory. In Super Mario Bros 2, the font isn't visible when there is no text on the screen. Try playing the game for a bit while you have the pattern editor open and you may see it appear at certain parts of the game.

Another thing that could cause this to happen is if the font is compressed. This will be discussed in further detail later in this document but you should know that if a font is compressed, it will likely take someone skilled with assembly level programming to decompress it. That's not something you can simply learn with a document like this one.

So you should now understand how I obtained the values for the Super Mario Bros. 2 table and you should also be able to create your own tables for almost any ROM that will run in Nesticle. But what about NES ROMs that won't run in Nesticle or ROMs that are for other systems altogether? Continue on to the next section entitled "Finding Font Values by Relative Searching".

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Finding Font Values by Relative Searching:

Unfortunately, Nesticle is the only emulator which features a pattern editor. When you want to create a table file for a ROM that won't run in Nesticle, it's time to move up to the next level and do a relative search.
Even though it runs in Nesticle, we will continue to use the Super Mario Bros 2 ROM in this lesson so that you can instantly compare your relative searching results with data that you already know to be true.

Before you begin your relative search, it's a good idea to open the ROM in a tile editor to see if the font is visible. I suggest using Tile Layer for DOS for the purposes of this lesson as it's easy to use but very powerful. Open Super Mario Bros 2 in Tile Layer and near the bottom, you'll find the font. Actually, you'll find multiple fonts but we'll discuss that later. If you don't see a font in the ROM you're wanting to hack, then it may be compressed. I'll discuss compression in-depth in Section XI: Compression.

Relative searching is an aspect of ROM hacking that seems rather complicated at first glance but is actually extremely simple. A relative searching program looks for relative changes from one hex byte to another. Rather than trying to explain this in overly technical terms, I'll give you an example. Here is a chart of the English alphabet in its regular position:

A  B  C  D  E  F  G  H  I  J  K  L  M
01 02 03 04 05 06 07 08 09 10 11 12 13

N  O  P  Q  R  S  T  U  V  W  X  Y  Z
14 15 16 17 18 19 20 21 22 23 24 25 26

Let's say we want to search for the word SELECT which appears in the player select screen of Super Mario Bros 2. Knowing the actual hex values isn't necessary for relative searching but knowing what order the font is in is. In this case (and most cases) the font remains in alphabetical order so the chart above will apply. To relative search for SELECT, we need to determine the letters' relative positions to each other. Use the chart above to do that and you'll see that SELECT's relative value is 19 05 12 05 03 20. This means that the hex value of the 2nd character will be 14 less than the first character, the hex value of the 3rd will be 7 more than the 2nd, the hex value of the 4th will be 7 less than the 3rd and so on.
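
To make the arithmetic concrete, here is a minimal Python sketch of the chart lookup described above. The function names (relative_positions, deltas) are made up for this illustration and don't come from any ROM hacking tool; the only assumption is that the font is stored in the order you pass in.

```python
def relative_positions(word, font_order):
    """Number the font characters from 1 and look up each letter of word."""
    index = {ch: n for n, ch in enumerate(font_order, start=1)}
    return [index[ch] for ch in word]


def deltas(positions):
    """Change from each character to the next one."""
    return [b - a for a, b in zip(positions, positions[1:])]


ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

select = relative_positions("SELECT", ALPHABET)
print(select)          # [19, 5, 12, 5, 3, 20]
print(deltas(select))  # [-14, 7, -7, -2, 17]
```

The second list is the "14 less, 7 more, 7 less" pattern from the text, written as signed numbers.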
What a relative searching program will do is look for a place in the ROM where the hex values match this pattern. It doesn't care what the actual hex values are as long as the 2nd is 14 less than the first, the 3rd is 7 more than the 2nd, etc.

Open Relative Search and the first thing it will do is ask for the name of the file you want to search. Type in the ROM name (make sure that it's in the same directory) and it will ask for the first byte. Type 19 and press enter. Now enter the rest of the bytes from above and then press enter. You'll see Relative Search display how many individual changes were found and then it will announce that it found a complete match of all changes at 1DC4Ch. Open your ROM in Hexposure, hit F1 to jump to a specific offset, type in 1DC4C, hit enter and you'll see that the word SELECT does indeed appear there. If you didn't already have a table built for this ROM, you would now know the values for the letters C, E, L, S & T and it would be easy to figure out the rest of the table values from that. Now that you know at least some of the hex values for the font characters, you can start building your table file accordingly.

Here's the part that some people have trouble grasping. It doesn't matter what order your font is in; if you can view it in a tile editor then you have all the information you need to relative search. Let's say you have a disordered font that looks like this:

G 7 T o R & 0 a t

How could you relative search for that? Simple, you just start numbering from the first character just like you would if it were in order:

G  7  T  o  R  &  0  a  t
01 02 03 04 05 06 07 08 09

Let's say you wanted to search for the word Goat. The relative order for Goat would be 01 04 08 09. If you had a ROM with a font in that order and typed those 4 bytes into Relative Search, you'd find the position of the word Goat (assuming the game contained the word at some point.)
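
For the curious, the core of a relative searcher fits in a few lines. The following is only a rough Python sketch of the idea, not the source of the actual Relative Search program; the function name relative_search and the short byte string standing in for a ROM are invented for the example.

```python
def relative_search(data, positions):
    """Return every offset where the bytes change by the same amounts
    as the given relative positions, whatever the actual byte values."""
    wanted = [b - a for a, b in zip(positions, positions[1:])]
    matches = []
    for start in range(len(data) - len(positions) + 1):
        window = data[start:start + len(positions)]
        if all(window[i + 1] - window[i] == d for i, d in enumerate(wanted)):
            matches.append(start)
    return matches


# A made-up 9 byte fragment, NOT real ROM contents. The middle six bytes
# are SELECT using the Super Mario Bros 2 table values shown earlier.
fake_rom = bytes([0x00, 0xFB, 0xEC, 0xDE, 0xE5, 0xDE, 0xDC, 0xED, 0xFB])

print(relative_search(fake_rom, [19, 5, 12, 5, 3, 20]))  # [2]
```

To try it on a real file you could read the ROM with open("smb2.nes", "rb").read() and pass the result in as data. The function never looks at what the bytes are, only at how they change, which is why it works the same for the disordered "Goat" font as for a plain alphabet.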
One thing you should know is that Relative Search may sometimes find more than one match. This means that the relative pattern appeared more than once in the ROM. This will happen more frequently when you're searching for shorter strings. If this happens, you'll simply have to test each instance until you find the right one. The simplest way to do this is to change the first hex value in the string by one and then look at the text in an emulator to see if it changed the first letter in that word.

You can expand your search in Relative Search by using * as a wildcard. Let's say you entered the values of 01 * 07. Relative Search would look for any 3 hex value string where the 3rd value is 6 more than the 1st, ignoring the second altogether.

If you know the hex value of some characters but not the others, you can input the actual value instead of a relative position. Let's say you somehow know that the value of S is 2Fh. You could limit your relative search by inputting these values into Relative Search:

2Fh 05 12 05 03 20

Now, Relative Search will only look at instances where the first hex value is 2Fh and the rest match the relative pattern.

Now that you understand the mechanics of relative searching, there is an easier way to do it. Some hex editors (including Hexposure & Hexecute) have built-in relative searching capabilities. Make a copy of the Super Mario Bros 2 ROM and rename it so Hexposure won't automatically open the table file as well, then open the copy in Hexposure. Now press F6 to relative search and it will ask for the text you wish to search for. Type in SUPER, hit enter and you'll be asked if you want to build a table based on the results. Say yes and you'll see that Hexposure has successfully found your text. Press F9 if you want to save your new table.

That's a much easier method, of course, but let me explain why you won't always want to use it. You can only search for alphabetical characters in this manner.
With Relative Search, you can search for a combination of letters, numbers, punctuation and even scrambled garbage if you like. Also, Hexposure has a bug that will prevent it from finding any results unless your search term only includes one case. That means you can only search for all UPPER CASE or all lower case words. If you have a word that starts with a capital letter and then the other letters are all lowercase, just leave off the first letter and search for the rest.

You should now have a basic understanding of relative searching and how to build a table file with it. These are the most basic of table building skills and all you need for very simple ROM hacking. The rest of this document will cover slightly more advanced subjects such as how to create a table for a Japanese ROM, an explanation of control codes and a section detailing various anomalies that could cause you difficulty when trying to build a table file.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Building Japanese Tables:

In general, Japanese tables are constructed in the same manner as English tables with the obvious difference being the language. Your computer most likely doesn't have a keyboard with the Japanese character set, so you're probably wondering how you can type the characters you'll need to create a Japanese table. You'll need two programs. The first is NJStar, which is a Japanese word processor, and the second is NJWin, which is a program that will allow you to view Japanese text in almost any application. Both of these programs are available as a free demo from www.njstar.com.

There are 4 different forms of Japanese writing. The first is kanji, which is the Japanese written language. Kanji is what most people are familiar with and consists of several thousand different pictographs. I'll discuss kanji further later in this section but for now, we're going to focus on kana. Kana can best be described as the written version of the Japanese spoken language.
Kana is similar to our alphabet in that each character represents a specific sound and multiple characters are combined to form words. There are two forms of kana, hiragana and katakana. In the contemporary Japanese language, hiragana is most commonly used to write native words and katakana is mostly used to write words borrowed from other languages as well as the names of foreign persons and places. The fourth form of Japanese is romaji, which is basically Japanese written in the same roman character set as the English alphabet.

Just as our alphabet has a specific order that it's usually written in, so does the Japanese alphabet. Here is the order that kana most commonly occurs in, written in romaji:

a   i   u   e   o
ka  ki  ku  ke  ko
sa  shi su  se  so
ta  chi tsu te  to
na  ni  nu  ne  no
ha  hi  fu  he  ho
ma  mi  mu  me  mo
ya      yu      yo
ra  ri  ru  re  ro
wa              wo
n'

Open up NJStar, create a new document and type the letter a. What you'll see is the hiragana character that corresponds to the a sound. In the bottom-right corner you'll see four buttons with Japanese on them. The second button should be dark and the other 3 light. This means you're typing in hiragana mode. If you click the first button, it will darken as well and you'll be typing in katakana mode. Switch to katakana mode and type another a and you'll see a different character. The hiragana 'a' looks something like a loop with a cross above it whereas the katakana 'a' looks slightly like a deformed 7. Both characters are pronounced the same way; it's just their usage that differs.

There's also a button on the bottom-left that contains Japanese text. Click it once and you'll see it display the word ASCII. This means you're typing in standard ASCII mode so if you press the 'a' key, the normal letter 'a' will appear in NJStar. Click the button again and you'll switch to romaji mode. When typing in romaji mode, nothing will instantly show up in the text field.
Instead, you'll see a series of buttons containing Japanese characters that match what you have typed so far. Click the button once more to return to kana mode.

You can download complete hiragana and katakana tables from various sources but I'd recommend creating your own so you have some practice with Japanese. Just replicate the romaji table from above and type the hiragana equivalent directly below each romaji entry. Then do the same thing for katakana. You can also view a hiragana table by clicking on the Help menu in NJStar and selecting Kana-Romaji Table.

Now you know how to type Japanese characters, so you're probably wondering how to create a Japanese table. Well, you find the hex values of the Japanese characters in the same manner you find the values of English characters. If you're using a ROM that's playable in Nesticle, you can use the Nesticle method. Otherwise, look at the font in a tile viewer and relative search accordingly using Relative Search. There's also a program named Romaji Search that you can use to search for Japanese text directly (much as you relative search in Hexposure by typing the word you're searching for instead of its relative value) but this program isn't recommended to anyone who doesn't have a firm understanding of how kana works.

Once you know what hex values correspond to what Japanese characters, you can start building your actual table. Go into NJStar and start typing your table exactly like you would an English table except that you'll switch to kana mode to type in each font character. This can be a tedious process but once you've done a table or two and understand how it all works, you can start using a table making program to do the grunt work for you. (See Section XII: Table Making Programs) Once you've finished your file, go to Save As and save your table, making sure to select one of the Text File formats so your table will be readable by standard programs.
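
To give a feel for the kind of grunt work that can be automated, here is a small Python sketch that writes out hiragana table entries. It is not how any of the table making programs mentioned in this document work. It assumes the ROM stores its hiragana consecutively in the kana order listed earlier, that the first character has the value 80h (a made-up starting value), and that you want the file saved in Shift JIS; the name table_lines is also invented for the example.

```python
# The 46 basic hiragana in the order of the romaji chart above.
HIRAGANA = ("あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
            "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わをん")


def table_lines(first_value, characters):
    """Build HEX=character entries for a font stored in one unbroken run."""
    return [f"{first_value + i:02X}={ch}" for i, ch in enumerate(characters)]


lines = table_lines(0x80, HIRAGANA)   # 0x80 is a hypothetical starting value
data = "\r\n".join(lines).encode("shift_jis")

print(lines[0], lines[-1], len(lines))
```

Writing data to a file opened in "wb" mode would give you a Shift JIS text file. Real fonts often aren't stored this neatly, so treat the output as a starting point to check against a tile viewer rather than a finished table.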

Note: There are multiple methods of encoding Japanese text. The two most prevalent are Shift JIS and EUC. In my experience, most people prefer Shift JIS so that's what I'd recommend using. You can always resave the file in EUC format if you find out you need to use EUC for some reason.

Once you've created your Japanese table and saved it in text format, you can open the ROM and table in Hexposure just as you would an English ROM and table. The only difference is that Hexposure doesn't have the ability to display Japanese characters. That's where NJWin comes in. If you haven't already, run NJWin and set it to Japanese Auto-Detect. This will automatically display the Japanese properly regardless of the encoding used. If you only want to view Shift JIS or EUC, you can select the specific encoding type you prefer. Once you have NJWin running, you'll be able to see the Japanese characters within Hexposure. (You'll also notice that it changes the border around the program into Japanese but that's just a quirk due to the ANSI codes used to create the border in Hexposure.) If you were to open your new Japanese table in Notepad (or any other non-Japanese text editor) while NJWin is running, you'd be able to see the Japanese characters there as well. Without NJWin running, you'll simply see scrambled garbage.

That covers everything except the dreaded kanji. Actually, kanji isn't much worse than kana, it's simply a lot more time consuming. Kanji pictographs don't represent sounds like kana and roman characters do. Instead, each pictograph represents an entire word or concept. As a result there are a lot more kanji symbols than kana. Whereas there are only 46 different characters for hiragana and another 46 characters for katakana, there are roughly 50,000 different kanji pictographs. Hopefully, you'll understand why I don't include a kanji table to go along with the romaji table above.

So how do you go about finding 50,000 different hex values?
Well, first of all, you'll never find a game that actually uses every kanji or even a large percentage of them. Some games will only use a few dozen, some (especially RPGs) will use hundreds of pictographs and I've heard of a few occasions where a game used over 1,000 kanji. As I said before, kanji isn't exceptionally more difficult than kana but it is a lot more time consuming. To find the hex values of kanji you simply look at the layout order of the kanji in a tile viewer and then relative search accordingly. Unless you only have a very small number of kanji, they'll almost certainly be stored with Multi-Byte Encoding (see Section X: Anomalies below.)

Once you've determined the hex values for the kanji, it's time to input the kanji into your table. There are basically three methods you can use to input kanji. The first method is the fastest but unfortunately requires that you have advanced knowledge of the Japanese language. If you can actually recognize kanji on sight, simply kick NJStar into romaji mode and type away.

For those of you less Japanese inclined, there's the radical lookup method. This still requires some practice but doesn't actually require that you learn the meaning of even a single kanji. Radical lookup basically consists of identifying the primary bushu (piece) of the kanji and counting the number of strokes in the bushu. Then go to the Input menu in NJStar and select Radical Lookup. Locate the primary bushu on that table (they're ordered by number of strokes) and then search through the row at the top for the kanji in question. You can limit your choices by selecting a range of strokes from the dropdown menu. (Note: This is the total number of strokes in the entire kanji, not just the primary bushu.) Once you've located the kanji in question, simply click on it and hit the Insert to File button. Now you can see why kanji input requires so much more time than kana.
I'm also having trouble figuring out the proper way to word this explanation, so hopefully you can understand what I'm trying to say.

The third method is a bit less brain taxing but still requires a lot of time and won't produce 100% accurate results. This method is known as the Kanji OCR Method and requires the program Kanji OCR. Unfortunately, Kanji OCR is a commercial program without a free demo, so you're on your own to find a copy of it. I'm not going to go into great detail concerning this method because you can learn most of what you need to know by reading the documentation included with the program, but, in short, you take screenshots of all of the kanji contained within your ROM and then scan the graphics with Kanji OCR which, if all goes well, will output the kanji in a text file format you can use to create a table.

A couple of tips regarding the Kanji OCR Method: Use a .bmp or .ras graphics format, make your graphics monochrome and, if your kanji are close together, use a graphics editor to space them out so as to receive a more accurate reading from Kanji OCR. In general, this will result in 80 to 85% of your kanji being identified correctly and you'll have to input the other 15% to 20% via one of the other 2 methods described above.

So that should cover everything you need to know to make Japanese table files. The kanji methods could also be applied to Chinese language games (there are a few out there) and even Korean games if such exist, provided you use software that corresponds with that language. NJStar also comes in a Chinese language version and NJWin will display Chinese and Korean in addition to Japanese.

I apologize for my butchered attempts to explain the basics of the Japanese language. See Section XV: Resources for much better sources of education regarding the Japanese language.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Control Codes:

Once you've located all of the font values, there's one final piece left to complete your table, the control codes. Control codes are simply codes that control how the text is displayed. For instance, some ROMs might automatically wrap text to the next line once it reaches the end of a line but others will require a line break control code that specifically says to start displaying text on the next line. Without this line break, the text will go off the side of the screen, shoot out of your monitor, spill all over your desk and possibly poke you in the eye. You certainly don't want that to happen.

So how do you find the control codes? Well, unlike with fonts, there aren't any techniques for searching them out. You simply have to experiment with the ROM. Open the game you're wanting to hack in a hex editor and locate some text that's early in the game (so you can easily test it). You'll need to find some text that takes up at least 2 lines in the game. Look at that text in an emulator and take note of where the text jumps to the second line as well as where it ends. Now look at that text in the ROM. Are there one or more hex values between the last character of the first line and the first character of the second line? If so, that's probably a line break control code. Find out for sure by changing the value and seeing what it does to the text in the game. If the text no longer jumps to the second line and instead scrolls off the side, you've found a line break. Not all games make use of control codes, however, so you'll just have to experiment to find out whether or not the game you're wanting to hack does.

Take a look at the end of that group of text as well. If there's an unaccounted for hex value between the end of that group of text and the beginning of the next group, that may be a string break control code.
Once again, try changing it to something else and see what occurs within the game. If text from the next section spills over into the section you're working on after changing the value, it's most likely a string break.

Hexposure has built-in support for some control codes. To indicate a control code in your table, you place the symbol for the control code followed by an equals sign and then the hex value for that particular code. Note that the hex value is on the opposite side of the equals sign when dealing with control codes that have built-in support. This informs Hexposure that the value is a control code and not a font character. Here are the control codes supported by Hexposure:

*  indicates a line break control code.
\  indicates a section break control code.
/  indicates a string break control code.

So a table that includes control codes would look similar to this:

*=00   (A line break control code with a value of 00h.)
\=01   (A section break control code with a value of 01h.)
/=02   (A string break control code with a value of 02h.)

(The above text contained in parentheses is simply for explanatory purposes and shouldn't be added to your table.)

Other programs may support some built-in control codes as well. Check the documentation for that particular program to be sure of how to specify control codes in your table file for that program.

There are several other possible control codes, depending on the way the game you're hacking was programmed. One common control code is the 'pause' code that causes the text to pause until the player hits a button. This usually happens when the text box is full but there's still more text to display, so the text display pauses (usually with a flashing cursor at the end of the text to indicate more text) until the player presses a button to cause the rest of the text to display. Another might be a specific hex value that prints whatever the player named their character at the beginning of the game.
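
If it helps to see the table format from a program's point of view, here is a minimal Python sketch that reads a table containing both font entries and the three control symbols listed above, then uses it to display some bytes. This is only an illustration of the format as described in this document, not Hexposure's actual behavior; the function names and the sample bytes are made up, and the control code values 00h and 02h are the same placeholder values used in the example above, not real Super Mario Bros 2 codes.

```python
# Control symbols as this document describes them for Hexposure tables.
CONTROL_SYMBOLS = {"*": "line break", "\\": "section break", "/": "string break"}


def parse_table(text):
    """Split table text into font characters and built-in control codes."""
    chars = {}      # byte value -> font character(s)
    controls = {}   # byte value -> control symbol
    for line in text.splitlines():
        if "=" not in line:
            continue                      # skip blank or malformed lines
        left, right = line.split("=", 1)
        if left.strip() in CONTROL_SYMBOLS:
            # Control code entry: symbol on the left, hex value on the right.
            controls[int(right.strip(), 16)] = left.strip()
        else:
            # Font entry: hex value on the left, character on the right.
            # The right side is not stripped, so "FB= " keeps its space.
            chars[int(left.strip(), 16)] = right
    return chars, controls


def decode(data, chars, controls):
    """Render bytes roughly the way a table-aware hex editor might."""
    out = []
    for byte in data:
        if byte in controls:
            out.append(controls[byte])
        else:
            out.append(chars.get(byte, f"<{byte:02X}>"))  # unknown value
    return "".join(out)


TABLE_TEXT = "\n".join(["EC=S", "DE=E", "E5=L", "DC=C", "ED=T", "FB= ",
                        "*=00", "/=02"])
chars, controls = parse_table(TABLE_TEXT)

sample = bytes([0xEC, 0xDE, 0xE5, 0xDE, 0xDC, 0xED, 0x00, 0xFB, 0x02, 0x7F])
print(decode(sample, chars, controls))   # prints: SELECT* /<7F>
```

Values that aren't in the table come out as the hex value in angle brackets, which is a handy way to spot bytes that might be control codes you haven't identified yet.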

If your game contains a control code that your hex editor doesn't have built-in support for, you can still work it into the table. Simply choose a character that's not used in the game and add it to your table with the corresponding hex value of the control code in question. For example, if your game has the 'pause' control code mentioned above, you could set a tilde (~) equal to the control code's hex value. Then, you could press the ~ key to insert a pause control code. It will display as ~ in the hex editor but the game itself doesn't care what's in your table file, only what hex values the ROM contains, so when you encounter that place in the game, the game will pause as expected.

Note: You might hear control codes referred to as Ballzy tables on rare occasions. If you're new to ROM hacking then you're not going to understand why. Just remember that Ballzy tables are control codes and you'll be fine.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Anomalies:

As with just about anything, there are always anomalies. The exception to every rule, as it were. This section will attempt to document several of these anomalies.

Multiple Fonts
--------------

It's not uncommon to find a ROM that contains multiple font sets. There are two different instances where this might happen. In the first instance, there are multiple copies of the same font and they all have the same corresponding hex values. In this case, you don't need to do anything different with your table file as it's accurate for all the fonts. The only explanation I can think of for why this occurs is lazy programmers.

The second instance where you'll find multiple fonts is when there are two or more fonts and they have differing hex values and possibly even different characters. This most commonly happens in RPGs where one font is used for the dialogue text and a different font is used for the menus but it can happen in non-RPGs as well.
In this case, you'll have to create a separate table file for each font set and then switch between tables when you want to edit text that's covered by the table you don't have loaded. All of the techniques described in this document still apply, you simply have to do double the work. If you're able to see some of the text in your ROM but not other parts of the text, it could be due to multiple font sets.

Non-linear Font Sets
--------------------

If you're lucky, your font sets will always be nicely alphabetical with numbers and punctuation following the letters. If you do much ROM hacking, however, you'll eventually encounter a ROM that doesn't store its font in such a convenient manner. When relative searching, this will affect the numbers that you input for the search. You'll have to open your ROM in a tile viewer to find out exactly how the font is stored.

There are several different ways that a ROM might store a font non-linearly. If there's not a lot of text in the ROM, the coders might have only inserted the letters that the game uses. I've also seen games that contain various graphical data interspersed with the letters. You might see something like this:

ABCDEFGHIJKLMNOPQRSTUVWXYZ

I've even seen a few games that store the font in the following manner:

AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz

So if your relative searching isn't working, take a look at your ROM in a tile viewer to make sure it's not due to a non-linear font set.

Odd Control Codes
-----------------

The Legend of Zelda is a classic example of a game that uses "odd" control codes. While it has a normal font set, it doesn't use normal control codes. Instead of having a separate hex value/control code for a line break, it has a whole set of values that are equal to a letter of the font *and* a line break.
So to display the letter A, you might use the hex value 0Ah but if A appears at the end of the first line it might use 4Ah and if it appears at the end of the 2nd line, it might use 8Ah. (Note those aren't necessarily the correct values, they're just an example.) So if you can find the text in your game but the last character of the line is missing, that might be why.

I've also heard of one or two cases where a ROM had separate hex values for a character followed by a space. In cases such as this, all you can do is experiment with the ROM to determine the correct hex values.

Multi-Byte Encoding
-------------------

If you're working with a Japanese ROM that contains a lot of kanji, you're likely to run into 16 bit hex values. There are only 256 possible 8 bit (2 digit) hex values, 00 through FF. So what happens if a game has more than 256 font characters? That's where 16 bit values come into play. Using 16 bit (4 digit) hex values, you have 65,536 different combinations available from 0000 to FFFF.

If you can't find any text via relative searching and you suspect the ROM might use 16 bit values, simply try searching using wildcards. For example, if you believe the text you're searching for should be in the relative order 5 17 12 22 11, you can check for 16 bit values by searching for 5 * 17 * 12 * 22 * 11 which will allow for any hex value to be in the place of the *.

You could also, on occasion, run into 24 bit hex values which would consist of 6 digit hex values ranging from 000000 to FFFFFF. You treat 24 bit values exactly like you do 16 bit except that you'd use 2 wildcards per character instead of 1 when relative searching.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Compression:

Compression is the bane of all ROM hackers. If you run into compression, you have no choice but to give up your project. Actually, that's how it was in the old days but many techniques have been developed for defeating compression since then.
This guide is strictly on how to build table files, so you won't learn how to decompress a ROM here but to be thorough, I felt it best to cover each of the types of compression and how they affect your table.

There are two types of compression, font compression and text compression. Text can be compressed in all 4 of the methods described below. Fonts will only fall under the category of "true" compression. A ROM can contain either or both font & text compression.

The simplest way to tell if your ROM contains a compressed font is to open it in a tile viewer. If you can't find the font, it's probably compressed and you should move on to another project until you've picked up more advanced ROM hacking skills. The rest of this section will explain how to detect and overcome text compression.

DTE Compression
---------------

Dual Tile Encoding is a somewhat common compression scheme in which a single hex value will reference 2 or more font characters. So instead of 00=A you might have 00=Ar. The Final Fantasy games almost all use DTE compression. If your ROM uses DTE, you'll have to include the DTE values in your table. Here's a small portion of the Final Fantasy (US version) table as an example:

1A=e
1B= t
1C=th
1D=he
1E=s
1F=in

Most games that use DTE will still have a regular one hex value to one font character value set as well. To determine the DTE values, first locate some non-DTE text within the ROM and then use trial and error to find the DTE values. A little common sense will limit the amount of time you spend with trial and error, however. If you see this text in your ROM:

S e the pri ess!

Then you should be able to recognize that DTE is used for the letter combinations 'av' and 'nc' and find their values accordingly.

Dictionary Compression
----------------------

Dictionary compression is another form of simple compression that involves the ROM containing a bank of commonly used words all in one spot.
Each of those words is assigned its own hex value (generally a 16 bit value) and when it's time for that word to appear in the text, the hex value for the word is used instead of the hex values for each individual letter within the word.

For example, let's say the ROM you're working on uses dictionary compression and one of the words within the wordset is princess which has a hex value of 007A and another word within the wordset is dragon with a hex value of 0082. Now let's say the ROM contains the sentence "Please rescue the princess from the dragon." In an uncompressed ROM, you'd simply see the hex values for each individual letter but because this game uses dictionary compression, instead of seeing a string of 8 hex values for princess, you'll see the 16 bit hex value for the word instead, sort of like this:

Please rescue the 007Ah from the 0082h.

This is somewhat of a bad example but hopefully you understand what I'm trying to explain. Once you have a complete table with all of the wordset values, you'll see the word 'princess' in the game text but in the hex, you'd see the value for t, value for h, value for e, value for [space] and then 007A followed by the value for [space], value for f, value for r, value for o, value for m.

To create a table with entries for the wordset values, start by making a regular table. Then open the ROM in a hex editor and look for the wordset. See what the first word is and then find a place where it's used in the normal text. For instance, using the sentence from above and a table that has the font values but not the wordset values, you'd find a line that looks like this:

Please rescue the from the .

Look to see what hex value sits between the hex values for 'the' and 'from' and you'll have the value for the word princess. Knowing this, you would add the following entry to your table:

007A=princess

Assuming the programmers of the game weren't demented, the next word from the wordset should have a hex value of one more than the first word.
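As a rough illustration of the dictionary scheme described so far, here is a minimal Python sketch of how a hex editor might apply a table that mixes one-byte font entries with two-byte wordset entries. Everything in it is an assumption made for the example: the font values (a-z starting at 80h, space at FFh), the extra wordset words, the big-endian byte order of the 16 bit values, and the function names. It is not taken from any real ROM or editor.

```python
def wordset_entries(first_value, words):
    """Map consecutive 16 bit values to words, assuming the wordset is sequential."""
    return {(first_value + i).to_bytes(2, "big"): w for i, w in enumerate(words)}


def decode(data, table):
    """Translate ROM bytes to text, trying 2-byte entries before 1-byte ones."""
    out = []
    i = 0
    while i < len(data):
        for size in (2, 1):
            chunk = bytes(data[i:i + size])
            if len(chunk) == size and chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No table entry: show the raw hex value, as a hex editor would.
            out.append(f"<{data[i]:02X}>")
            i += 1
    return "".join(out)


# Hypothetical font table: a-z at 80h-99h, space at FFh.
font = {bytes([0x80 + k]): ch for k, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
font[bytes([0xFF])] = " "
encode = {ch: key for key, ch in font.items()}

# "the " + wordset value 007A + " from", as it might sit in the ROM.
data = (b"".join(encode[c] for c in "the ")
        + bytes([0x00, 0x7A])
        + b"".join(encode[c] for c in " from"))

# Font values only: the wordset value shows up as unknown bytes.
print(decode(data, font))   # the <00><7A> from

# Add the wordset (made-up words after "princess") and decode again.
table = dict(font)
table.update(wordset_entries(0x007A, ["princess", "castle", "sword"]))
print(decode(data, table))  # the princess from
```

The decoder tries the longer entry first so that a 16 bit wordset value is not mistaken for two separate font characters. A real game may tell the two apart differently.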
So if your first word has a value of 007A, the second word should have a value of 007B. This will allow you to make your table much faster as you won't have to check every single value. If it's not set up like this, you'll have to find the value for each word in the same manner that you used to find the first word.

Substring Compression
---------------------

Substring compression is the same as dictionary compression except that instead of assigning a hex value to each word, the ROM uses pointers to indicate which word to display. Pointers aren't a part of table building and are therefore beyond the scope of this document. There are plenty of pointer docs available for you to read but you should learn the basics of ROM hacking before attempting to learn how to hack pointers.

True Compression
----------------

Not to insinuate that the forms listed above aren't real compression but they can all be hacked by someone with no discernible programming skill. "True" compression requires some semblance of Assembly coding ability with the applicable Assembly language. For NES, that would be 6502 Assembly. SNES uses 65816 and Gameboy uses z80. If you're new to ROM hacking, it's best to just flat out avoid Assembly hacking. You need a very strong grasp on ROM hacking techniques before it will do you any good to try and ASM hack a ROM. Not to say that you can't learn to do it but it's certainly not within the scope of this document and it's something I don't know myself.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Table Making Programs:

There are a few applications that will make your tables for you. Unfortunately, they can't find the hex values for you; they can simply take the data that you uncover and place it in table format, thus saving you a lot of tedious, repetitious typing. There are half a dozen or so table making programs to choose from but I'm only going to cover 2 of them here.
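To give a sense of the typing these programs save, here is a minimal Python sketch that fills in table entries for a linear font set from a single starting hex value. The function name is made up for this example and the code is not how either program below is actually implemented. The starting value D0h and the character order come from the Super Mario Bros 2 table shown in Getting Started.

```python
def linear_table(first_value, characters):
    """Return 'HEX=char' lines for characters stored back to back in the font."""
    return [f"{first_value + i:02X}={ch}" for i, ch in enumerate(characters)]


# Digits followed by capital letters, starting at D0h.
lines = linear_table(0xD0, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
print(lines[0])    # D0=0
print(lines[10])   # DA=A
print(lines[-1])   # F3=Z
```

This only helps once you've already found the first value and confirmed the font is linear. Punctuation and control codes still have to be added by hand.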
You're welcome to try the others and see if you prefer one of them.

Table Maker
-----------

Table Maker by Jair (author of Relative Search) is a really simple program for quick table generation. When you first run Table Maker it will ask for the name of the table file you wish to create, so type in the file name. Next select ASCII (standard English characters) or kana (Japanese characters). Finally, input the first hex value that will appear in your table.

Table Maker will then display the hex value and wait for you to input the corresponding font value. Once you hit enter, it will move on to the next hex value. Press ESC once you're finished and it will output the results to the file you specified.

Note: If you specify a file that already exists, Table Maker will simply append to the end of the file, not overwrite it.

If you select kana as your input mode, you simply type in the romaji equivalent and Table Maker will convert that to the corresponding kana. Use lowercase romaji for hiragana and uppercase for katakana. You can also press TAB to switch between ASCII and kana mode.

Note: Some people have had problems running Table Maker under Windows ME.

Table Auto-Generator
--------------------

And now for a bit of shameless self-promotion. TAG is a program that I wrote a couple of years ago because none of the table generators worked quite the way I would have liked. TAG is a bit more robust than Table Maker. It allows for standard English characters, hiragana & katakana and various others. There is a separate text box for each character and you can type each value in by hand (which somewhat defeats the purpose) or you can enter the value for the first character of each set and then press Generate and TAG will automatically fill in the rest of the values. Of course, this will only work if you have a linear font set. TAG also allows you to choose between Shift JIS or EUC as the encoding format.

Table Auto-Generator is far from perfect, however.
The most obvious bug is the fact that I somehow managed to forget to include a ? among the font characters (you can add one yourself in the Other section, though). There are also various minor bugs that won't actually affect the functioning of the program. I still have the source code for TAG so hopefully I'll one day be in a position to update it once again.

So, as with every other type of program, the choice is up to you. Try them both as well as any others you might find and use whatever you like the most. Or even build your tables by hand, if you prefer.

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Conclusion:

If you've actually read this entire document, then I'm impressed. I originally started writing it as an example of my work to provide to companies that were hiring technical writers so that's why it's a bit dry and lacking in my normal sarcasm. I still can't believe that I wrote the entire thing in 3 sittings, albeit with a rather long gap between the first and second sittings.

If you can read this entire document and understand everything that's explained within, you should certainly be able to make progress in ROM hacking. While technical know-how is important to your success as a ROM hacker, patience is the #1 most important factor in whether you will succeed or not.

So now that you know everything there is to know about building ROM hacking tables, what's next? Since you presumably want to translate games from one language to another, the next subject I'd recommend studying is script extraction/insertion. This will allow you to pull the text from a game, have it translated into another language and then put it back into the ROM.

Another subject that you'll want to become very familiar with is pointers. Understanding pointers will make your ROM hacking tasks much, much easier and will often increase the quality of your translations.
There are several documents written on this subject and you should have no problems finding them at some of the ROM hacking sites listed in Section XV: Resources.

Above all else, remember that this is a hobby and nothing more. Too many people (myself included) get caught up in the politics of "the scene" and forget that it's all about video games. So whatever you do, don't forget that. I mean, what's the point of a hobby if you're not having fun?

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Credits:

I don't claim to have developed any of the techniques described in this document; I simply collected them all into a single reference.

Special Thanks To The Following People:

Snowbro - for creating Hexposure & Tile Layer / Tile Layer Pro.
Jair - for creating Relative Search and writing a tutorial on it.
Patrikus - for originally teaching me how to make tables.
Neil - for information on relative searching.
satsu - for information on MBE and substring compression.
animefx - for information on the Kanji OCR method he developed.
*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Resources:

My Website
----------

Suicidal Translations - http://www.pigtails.net/ST

Program Homepages
-----------------

Kanji OCR - http://www.kanjikit.com
Nesticle - http://bloodlust.zophar.net
NJStar - http://www.njstar.com
NJWin - http://www.njstar.com
Hexposure - No Current Homepage
Relative Search - http://fly.to/vale
Table Auto-Generator - http://www.pigtails.net/ST
Table Maker - http://fly.to/vale
Tile Layer - No Current Homepage

ROM Hacking Information
-----------------------

RHDO - http://www.romhacking.org
RPGd - http://rpgd.emulationworld.com
SGC - http://sgc.jandar.net
Whirlpool - http://donut.parodius.com
Zophar's Domain - http://www.zophar.net

Japanese Information
--------------------

Japanese Online - http://www.learn-japanese.com
Jim Breen's Page - http://www.csse.monash.edu.au/~jwb/japanese.html

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

Contact Info:

I'm not an exceptional ROM hacker. I wrote this document in hopes of helping some people learn to ROM hack and to save time in answering the same questions over and over again on messageboards. If you have questions regarding ROM hacking, you're going to get a lot more help by asking on a messageboard than you will by e-mailing me.

If you find an error in this document, then contact me and I'll most likely fix it. Do NOT under ANY circumstances e-mail me asking to translate a game and if you e-mail me a ROM, you will suffer horribly. If you can comply with these rules, my e-mail address is inverse@pigtails.net

The most recent copy of this file can always be found at my website Suicidal Translations (http://www.pigtails.net/ST) along with other documents and utilities I've written and my own ROM translations.
-InVerse

*-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-=*=-=-*

09/03/01 - The Definitive Guide to ROM Hacking Tables - v1.00