September 17th, 2015, 03:19 AM
We have some large .rar files on a Windows server, each containing PDF, XLS, or other kinds of files. When we need to find a file, we must look inside all of the rar files (hundreds of them) one by one. So my question is: would it be possible to index the contents of these rar files and search that index, so we can go directly to the rar file containing the file we are looking for?

Thank you!

September 17th, 2015, 04:47 AM
CatDisk, by Rick Hillier, will do just that, and actually better. It creates a menu-driven, searchable database of all archives and their contents. It's a DOS program, however, but it runs just fine with DOSBox. You will need version 8.00 or later for RAR support.

September 17th, 2015, 08:18 AM
I know it's heresy to a modern Windows user, but look at the rar command-line interface. The 'l' command will create a directory listing, and it has two optional modifiers: the 'b' modifier, which gives just bare file names, and the 't' modifier, which gives a detailed "technical" listing. Thus:

rar l corruption.rar gives:

Archive corruption.rar

Name            Size   Packed  Ratio  Date      Time   Attr     CRC       Meth  Ver
INSTALL.EXE    18860    18556    98%  31-05-95  05:44  ..R....  568F85DD  m3b   2.9
README.TXT      2396     2396   100%  08-06-95  15:43  ..R....  3FB812CA  m0b   2.9
2              21256    20952    98%

rar lt corruption.rar gives:

Archive corruption.rar

Name            Size   Packed  Ratio  Date      Time   Attr     CRC       Meth  Ver
    Host OS   Solid  Old
INSTALL.EXE    18860    18556    98%  31-05-95  05:44  ..R....  568F85DD  m3b   2.9
    Windows   No     No
README.TXT      2396     2396   100%  08-06-95  15:43  ..R....  3FB812CA  m0b   2.9
    Windows   No     No
2              21256    20952    98%

and rar lb corruption.rar gives:

INSTALL.EXE
README.TXT

RAR's CLI has many more options than the GUI. Redirect any of these listings to a text file and Bob's your uncle.
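
If you'd rather script the redirect than type it, here is a minimal Python sketch of the same step. It assumes `rar` is on the PATH; the helper names `bare_listing` and `save_listing` and the `.rar.txt` naming are my own inventions, not anything RAR provides.

```python
import subprocess
from pathlib import Path


def bare_listing(archive, runner=subprocess.run):
    """Return the bare file names inside `archive`, as printed by `rar lb`.

    `runner` is only there so the call can be swapped out for testing;
    by default it is subprocess.run and requires rar on the PATH.
    """
    result = runner(
        ["rar", "lb", str(archive)],
        capture_output=True, text=True, check=True,
    )
    # Drop empty lines and surrounding whitespace from the output.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def save_listing(archive, names):
    """Write the names to a text file next to the archive (foo.rar -> foo.rar.txt)."""
    out = Path(str(archive) + ".txt")
    out.write_text("\n".join(names) + "\n", encoding="utf-8")
    return out
```

Calling `save_listing("corruption.rar", bare_listing("corruption.rar"))` should leave a `corruption.rar.txt` beside the archive, which is the same thing `rar lb corruption.rar > corruption.rar.txt` does at the prompt.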

September 17th, 2015, 12:05 PM
RAR's CLI is quite powerful. I long ago wrote a batch file using RAR that gives me a 25% recovery volume for daily backups of the office customer database, then uploads it to on-site NAS storage. It wouldn't be (easily) possible to do what that batch file does without RAR's CLI. It's well worth getting to know.

September 17th, 2015, 12:43 PM
Yeah, this definitely seems like a case for taking the *nix chain-of-small-utilities approach. If you're running post-9x Windows, cmd.exe should provide most of the glue you'd need to set up a batch file that uses the rar command-line interface to dump listings to a parallel set of text files, which you can then search with a utility like WinGrep (http://www.wingrep.com/). Probably quicker and easier than searching for one particular archive utility with just the right combination of features.
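
For anyone who prefers a script to a batch file, the same idea can be sketched in Python: walk a folder, dump one listing per archive into a parallel text file, then search those listings by wildcard. This is only a sketch under assumptions: `rar` is on the PATH, and the function names and the `.rar.txt` naming scheme are hypothetical. It indexes file names only, not the text inside each file.

```python
import fnmatch
import subprocess
from pathlib import Path


def list_archive(archive):
    """Bare member names of one archive via `rar lb` (assumes rar is on the PATH)."""
    out = subprocess.run(
        ["rar", "lb", str(archive)],
        capture_output=True, text=True, check=True,
    ).stdout
    return [name for name in out.splitlines() if name.strip()]


def write_listings(root, lister=list_archive):
    """For every *.rar under root, write a parallel <name>.rar.txt listing."""
    written = []
    for archive in sorted(Path(root).rglob("*.rar")):
        listing = archive.with_name(archive.name + ".txt")
        listing.write_text("\n".join(lister(archive)) + "\n", encoding="utf-8")
        written.append(listing)
    return written


def search_listings(root, pattern):
    """Return (archive, member) pairs whose member name matches a wildcard."""
    hits = []
    for listing in sorted(Path(root).rglob("*.rar.txt")):
        archive = listing.with_name(listing.name[:-4])  # strip ".txt"
        for name in listing.read_text(encoding="utf-8").splitlines():
            # Lowercase both sides so the match ignores case on any OS.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                hits.append((archive, name))
    return hits
```

Run `write_listings` once (or nightly), then something like `search_listings(r"D:\archives", "*invoice*.pdf")` tells you which rar to open; the path and pattern here are made up. Pointing WinGrep at the generated .txt files works just as well.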

September 23rd, 2015, 10:52 AM
November 20th, 2015, 01:49 AM
In the old days of CD backups, I had all my CDs indexed with a very good program called WhereIsIt.

They are still improving it today, and it has millions (not exaggerating) of options, server editions, plugin support, blah blah blah...

November 20th, 2015, 07:00 AM
"index the contents of these rar files"

That's the part most of these replies are missing. They don't want to index the list of PDFs, they want to index each PDF's contents.

Our solution at my place was to simply store the PDFs uncompressed, so that Adobe Acrobat and other programs can index them as-is.

If your PDF files compress so well that RAR significantly reduces their size, I'd argue you are not optimizing your PDFs very well.