HARLEY HAHN'S USENET CENTER
File Sharing Tutorial
Earlier in the tutorial, we started with a file that we wish to upload:
Using WinRAR, we compressed and split this file into 43 RAR files:
In the previous section, we used QuickPar to process these RAR files in order to generate a series of PAR2 error correction files. The output of QuickPar was the following 10 files:
The goal of this section of the tutorial is to explain how these files are used and why they are named the way they are.
The Verification File / The Parity Files
Every set of PAR2 files consists of a verification file plus one or more parity files. The VERIFICATION FILE is used to test whether or not a set of data files is complete. The PARITY FILES contain information that can be used, if necessary, to fix corrupted data or to replace missing data.
PAR2 verification files are named according to the following pattern:
where basename is the first part of the name of the original file. (You can change the basename when you generate the PAR2 files.) In our example, the verification file is:
PAR2 parity files are named according to the following pattern:
where xxx and yyy are both numbers. In our example, the first parity file is:
To make sense out of this pattern, you need to understand a bit about how PAR2 files are used.
How PAR2 Files Are Used to Restore Data
The Parchive system is based on the idea that, with enough parity information, it is possible to fix damaged or missing data. Parchive considers a set of data files (such as RAR files) as one large collection of data that it divides into fixed-size BLOCKS. As a Parchive program processes data (in our example, the 43 RAR files), it creates enough parity data to restore the various blocks. This parity data is then stored in PAR2 files.
With a minimum amount of parity data, you can (if necessary) restore one damaged or missing block. With a little more parity data, you can restore two blocks. With even more parity data, you can restore three blocks; and so on.
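To make the idea of parity data more concrete, here is a minimal Python sketch. It is a deliberate simplification: it uses plain XOR parity, which can restore only one missing block, whereas PAR2 uses Reed-Solomon coding to restore several. The block contents and the function name `xor_blocks` are made up for the illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three equal-size data blocks (made-up contents, for illustration only).
data = [b"AAAA", b"BBBB", b"CCCC"]

# One parity block: the XOR of all the data blocks.
parity = xor_blocks(data)

# Pretend block #1 was lost, then rebuild it from the survivors plus parity.
survivors = [data[0], data[2]]
restored = xor_blocks(survivors + [parity])

print(restored == data[1])  # True
```

With one parity block you can rebuild any single lost block. Restoring two or more blocks requires a stronger scheme, which is what the real PAR2 parity files provide.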
I mentioned earlier that a PAR2 verification file can be used to test if a set of data files is complete. As such, it is similar to the SFV file we discussed earlier in the tutorial. What makes a PAR2 verification file different is that it can also indicate where the mistakes are and how many blocks of parity data are necessary to fix them.
Why is this important?
Let's say you have a fast, inexpensive, reliable Internet connection. You also have a modern newsreader program and access to a commercial Usenet server. When you download large binary files, you will probably download all the PAR2 files at the same time you download the data files.
After the downloading is finished, your newsreader will automatically use the PAR2 verification file to test the data. If necessary, the newsreader will then use the PAR2 parity files to fix the problems. (I'll show you what this looks like later in the tutorial.)
However, what if you live in a place where your only access to the Internet is slow, expensive, and not always reliable? And what if you do not have access to a fast, high-quality commercial Usenet server? In such cases, you will not want to take the time (or pay the money) to download PAR2 files unless you really need to.
The beauty of the PAR2 system is that, once you have all the data files, you need only download a single, small verification file to test all the data files. If they are okay, you are finished. If not, you will be told how many blocks of parity data you require. This enables you to download the minimum number of PAR2 files you need to fix the problem.
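The checking step can be pictured with a short Python sketch. It does not read the real PAR2 format. It only assumes that a verification file records one checksum per fixed-size block, and it uses MD5 from Python's `hashlib` as a stand-in. The tiny block size and the function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; real block sizes are far larger

def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one MD5 digest per fixed-size block of data."""
    return [hashlib.md5(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def damaged_blocks(expected, data, block_size=BLOCK_SIZE):
    """Return the indexes of blocks that are missing or do not match."""
    actual = block_hashes(data, block_size)
    return [i for i, digest in enumerate(expected)
            if i >= len(actual) or actual[i] != digest]

original = b"ABCDEFGHIJKLMNOP"       # 4 blocks of 4 bytes
expected = block_hashes(original)    # what a verification file would record

received = b"ABCDxxGHIJKLMNOP"       # block #1 was corrupted in transit
bad = damaged_blocks(expected, received)
print(f"Damaged blocks: {bad}; recovery blocks needed: {len(bad)}")
```

The number of damaged blocks is the number of recovery blocks you would need, which is the figure a program like QuickPar reports after a check.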
The question is, once you know how much parity data you need, how do you know how many PAR2 files you need to download? The answer is, you can tell by looking at the file names.
Understanding PAR2 File Names
As I mentioned above, PAR2 parity files are named according to the following pattern:
where xxx and yyy are both numbers.
The first number (xxx) tells you the number of the first block of parity data contained in the file. The second number (yyy) tells you the total number of blocks in the file. For example, consider the file with the name:
This file contains one block, starting with block #0. (For technical reasons, computer programmers often number things starting from zero.) Now consider the file:
This file contains 13 blocks of parity data, starting from block number 15. That is, it contains blocks #15 through #27. The table below contains similar information about the PAR2 files we generated in our example in the previous section (1 verification file and 9 parity files). Note that the verification file does not contain recovery blocks.
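As a rough illustration of the naming rule, the following Python sketch pulls the two numbers out of a parity file name and works out which blocks the file holds. It assumes the usual QuickPar-style form `basename.volXXX+YYY.par2`. The basename `example` and the function name `parse_parity_name` are made up for the illustration and are not the names used in this tutorial.

```python
import re

# Assumed naming convention: <basename>.vol<first>+<count>.par2
PARITY_NAME = re.compile(
    r"^(?P<base>.+)\.vol(?P<first>\d+)\+(?P<count>\d+)\.par2$",
    re.IGNORECASE,
)

def parse_parity_name(filename):
    """Return (first_block, count, last_block), or None if not a parity file."""
    match = PARITY_NAME.match(filename)
    if match is None:
        return None  # e.g. the verification file, which holds no recovery blocks
    first = int(match.group("first"))
    count = int(match.group("count"))
    return first, count, first + count - 1

# "example" is a hypothetical basename.
for name in ["example.par2", "example.vol000+01.par2", "example.vol015+13.par2"]:
    print(name, "->", parse_parity_name(name))
```

For a file with the numbers 015 and 13, the sketch returns `(15, 13, 27)`: 13 blocks, numbered #15 through #27.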
Let me show you how these file names are useful.
Let's say you have downloaded all the data files in our example from the previous section of this tutorial. You then download the PAR2 verification file into the same directory (folder) as the data files. You then use the verification file to check the data. (If you have installed QuickPar on your computer, all you have to do is double-click on the verification file.)
If there are no errors, you are done. You don't need to download any of the PAR2 parity files.
Let's say there is a small amount of corrupted data, and you are told you need 1 recovery block to fix the problem. All you need to download is parity file #1, because it contains 1 recovery block:
On the other hand, what would you do if you were missing a large chunk of data, and you were told you needed 41 blocks of recovery data to fix the problem? In this case, you have several choices.
First, you could download parity files #1 through #6. Collectively, they contain 52 blocks of recovery data (1+2+4+8+13+24 = 28+24 = 52), which is more than the 41 you need:
Alternatively, you could download any one of parity files #7, #8, or #9, as each of them contains 41 recovery blocks:
Once you have the parity files you need, you can use the verification file to re-check the data files. (Just double-click, once again, on the verification file.) This time, because you now have the requisite parity files, the errors will be fixed automatically.
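The choice of which parity files to download can also be worked out mechanically. The Python sketch below uses the block counts from our example (parity files #1 through #9) and tries every combination, keeping the one that covers the required number of blocks with the fewest total blocks. It assumes that fewer blocks means a smaller download. The function name `choose_parity_files` is hypothetical.

```python
from itertools import combinations

# Recovery blocks in parity files #1 through #9, as described in the text.
BLOCKS_PER_FILE = {1: 1, 2: 2, 3: 4, 4: 8, 5: 13, 6: 24, 7: 41, 8: 41, 9: 41}

def choose_parity_files(needed, blocks_per_file=BLOCKS_PER_FILE):
    """Pick the files that cover `needed` blocks with the fewest total blocks.

    Ties go to the combination with fewer files. Returns None if even
    all the files together do not hold enough recovery blocks.
    """
    if needed <= 0:
        return ()  # nothing to repair, so nothing to download
    best = None
    files = sorted(blocks_per_file)
    for size in range(1, len(files) + 1):
        for combo in combinations(files, size):
            total = sum(blocks_per_file[f] for f in combo)
            if total >= needed and (best is None or total < best[0]):
                best = (total, combo)
    return None if best is None else best[1]

print(choose_parity_files(1))    # (1,)
print(choose_parity_files(41))   # (7,)
print(choose_parity_files(10))   # (2, 4)
```

For 41 blocks the sketch picks parity file #7 alone, which matches the second choice described above. With only nine files, trying every combination is fast enough.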
One last observation: Take a look at the table above and notice the size of the parity files: #1 is the smallest, #2 is larger, #3 is even larger, and so on, up to a maximum size. Do you see the beauty of this system?
The PAR2 design was created this way so that when a user with a slow Internet connection finds a data error, he need download only the minimum number of PAR2 parity blocks necessary to fix the problem.
When sharing large binary files, PAR2 files are used to detect and fix data errors.
If you have a fast, inexpensive Internet connection, you will probably download all the PAR2 files. If so, your newsreader will use them automatically to check the data and, if necessary, fix any errors.
If you have a slow, expensive Internet connection, you need only download the PAR2 verification file, which you can then use to check the data. If there are errors, you need download only enough parity files to fix the errors. There is no need to download parity files you don't need.
© All contents Copyright 2013, Harley Hahn