Retrieving data from an old Amstrad - help!



lizzyd
September 17th, 2005, 04:39 AM
Hi there,

I'm not sure if I'm in the right place, but I'd be really grateful if anyone out there could point me in the right direction with this..

My dad has loads of old letters and stuff that he wrote on his old Amstrad PC2086 D, dating from about 1990. They're all saved on 3.5 DS DD floppy discs. However when I try to open the files on my modern computer everything is jumbled up.

I can see the folders and all their names OK, but none of the files within them have proper file extensions. I've managed to open some of them by giving them the extension .txt, but then all the contents come out mixed up in a load of gobbledegook.

Eg, one of them looks like this:


DOC y I 0 x 0 A 4 x L a y o u t 0 x

L a y o u t 1 x

L a y o u t 2 x

L a y o u t 3 x

L a y o u t 4 x

L a y o u t 5 x

L a y o u t 6 x

L a y o u t 7 x

L a y o u t 8 x

L a y o u t 9 x

I x )L a y o u t 1 x


H


IVY COTTAGE
H

H
x From George Dolman 3/9/98
H

H
0 <<Uncle John Dolman lived in Post Office in Church Lane. This
H
??was known as Ivy Cottage. It is about 200 300 years old but is
H
now concreted over.
H

H
??Owned by Wyggeston Hopital estate. (Ithink theis refers to Ivy
H
cottage but may refer to farms.


Basically, I would love to get rid of all this gobbledegook and just turn them all into nice clean Word files, as part of my dad's 80th birthday present next month!

Does anyone have any idea how I might do this? If yes, I'd be extremely grateful for any help you could give me.

(by the way, I don't know if this might help or not, but when I just copied and pasted the above quote, lots of the characters I'm trying to get rid of disappeared - in my file every space appears as a kind of little upside down T-bar character)...

anyway, enough rambling - thanks for reading - it would be great to hear from anyone who could help!

Lizzyd

mbbrutman
September 17th, 2005, 06:29 AM
The general problem is that the modern program you are using to open the file doesn't know what is in the file. The file format is incompatible.

You will have to find a conversion program, or write one from scratch. Writing one from scratch isn't feasible anymore - the file formats are usually too complex and not documented. Occasionally you can get lucky by just stripping the 'unknown' characters out and converting it to straight ASCII.

Terry Yager
September 17th, 2005, 08:25 AM
The files were probably created by a word processor, and the garbage you're seeing is the "control characters" used by the program. Can you find out the name of the program used to write them in the first place? It might be possible to run that program and then convert the files to straight ASCII.

--T

lizzyd
September 17th, 2005, 08:53 AM
thanks both of you for the replies!

i'm afraid i don't know which word processor my dad used, and neither does he.

also i searched google for how to strip my code to ascii but still not sure if i understand how to do it. i'd be grateful for any pointers!

thanks very much again!

lizzyd

Terry Yager
September 17th, 2005, 09:11 AM
There are programs that can convert certain word processor files to another format. I used to use a program called UNWS (unwordstar) to convert WordStar .doc files to ASCII. Some modern word processors have the file conversion built-in. Unfortunately, without knowing the original file format, the conversion will be difficult (if not impossible). You might try, by trial & error, using Word to convert the files; you might get lucky. Just try converting from all the different formats offered.

--T

lizzyd
September 17th, 2005, 09:57 AM
thanks very much again terry - will try that!

carlsson
September 17th, 2005, 01:48 PM
If you can upload one of the documents in ungobbled condition, some of us others may also help you find the right program. I'm wondering if it may be an Amstrad specific program, or just any IBM PC compatible application. Remember the PCW (personal computer word processor) series; although it is a completely different beast, it would not surprise me if Amstrad delivered word processing software with their later PC compatibles too.

Terry Yager
September 17th, 2005, 07:54 PM
Yeah, I was kinda thinking sum'n along those same lines, like mebbe a version of LocoScript ported to the PC platform.

--T

lizzyd
September 18th, 2005, 12:13 AM
Hi again,

thanks very much again guys - i've uploaded a typical gobbled file to:

http://www.hotelheaven.co.uk/DOLSEP98.P2

it's just a few jottings that my dad made for his local history group - it is readable but just extremely messy as you can see. i'm afraid i don't have an example of anything that's ungobbled though.

thanks very much for your help - it would be great if you do have any ideas, but don't spend too much time on me!

lizzyd

Terry Yager
September 18th, 2005, 08:37 AM
Lemme see if I can dig up a program that strips the eighth bit from a file, that might do the trick. I know there are many such programs around, I just hafta find one.
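
For reference, stripping the eighth bit just means masking every byte with 0x7F. Here is a minimal Python sketch of the idea, assuming the file has been read as raw bytes. The function name and the sample bytes are made up for illustration, and whether this helps with these particular files is not known at this point:

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, leaving plain 7-bit values."""
    return bytes(b & 0x7F for b in data)


# Made-up WordStar-style sample: the last letter of the word has its high bit set.
sample = bytes([0x48, 0x65, 0x6C, 0x6C, 0xEF])  # "Hell" + ("o" | 0x80)
print(strip_high_bit(sample).decode("ascii"))  # prints "Hello"
```

Bytes that are already below 0x80 pass through unchanged, so control codes in that range would still be left in the output.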

--T

Terry Yager
September 18th, 2005, 08:59 AM
Here's a link to a $10.00 shareware program that looks like it might do the trick. It can strip the high bit, strip the control codes, or both (it has lots of other features too):

http://www.lgosys.com/products/stripper.html

I don't have the time to try it right now, but mebbe I can get to it later this evening.

--T

carlsson
September 18th, 2005, 01:56 PM
I downloaded the trial version of WordPort, which has a lot of obsolete word processor filters. It would not auto-detect the "P2" file, and most filters I applied manually were refused by the software because the source file did not match. Some filters that went through:

Mass-11 (VAX/PC): only outputted control portion.
Multimate: outputted an empty message not counting the program's own disclaimer.
Nota Bene (DOS version): retained the original document w/ control characters. Ditto for the Windows filter.
PC Write 2.71+: outputted all the controls (like Mass-11), but none of the text.
Signature: same result as Nota Bene.
SpellBinder: almost the same result as Nota Bene.
VolksWriter: stripped out most, but not all control characters.
WordMarc Composer: empty message + disclaimer.
WordPerfect 4.0: empty message + disclaimer.
WordStar 3.x-4.0: stripped part of message.
WordStar 5.0-7.0: ditto as WS 3-4.
WordStar 2000: similar to SpellBinder.
XyWrite II, III, IV: similar to Nota Bene.

That is around 45-50 different word processors it is not written by.

I suppose the Amstrad PC in question is nowhere to be found or not working these days, and you only have the floppies left?

It seems the program uses CTRL+B for space and opens a new line with H followed by CTRL+K. Then follows five bytes of some formatting information, maybe horizontal centering on the current row.

Terry Yager
September 18th, 2005, 02:17 PM
"That is around 45-50 different word processors it is not written by."

Yes, that's what we were afraid of, some very uncommon wordprocessor format <sigh!>.

--T

carlsson
September 18th, 2005, 02:58 PM
If only those programmers had decided to tag the document files with something more unique than "DOC" at the beginning of the file... :? Theoretically, something like the file utility on Unix environments could identify the file based on the header, but that requires someone to have identified the pattern before. At least, asking file did not help me. Googling based on what part of the binary file format looks like is not possible either.
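
To illustrate the header-matching idea, here is a rough Python sketch of signature-based identification. The table is hypothetical: the PDF and ZIP entries are well-known signatures included only for contrast, and the "DOC" entry merely records what this mystery file appears to start with, not a confirmed format:

```python
# Hypothetical signature table. The last entry is an assumption based on the
# visible start of the sample file, not a documented file format.
SIGNATURES = {
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive",
    b"DOC": "unidentified word processor file (starts with DOC)",
}


def identify(header: bytes) -> str:
    """Return a description for the first signature the header starts with."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


print(identify(b"DOC\x00\x01"))
```

A three-letter tag like "DOC" is weak evidence on its own, which is exactly the problem: many unrelated programs could have chosen the same bytes.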

Some more DOS word processing software:

Breeze: Nope, does not match source file.
EasyWord: No free/trial download.
Timeworks WordWriter: No free/trial download.
Finally!: No free/trial download.

Writing or using some software that replaces every CTRL-B with a space and removes everything else, followed by some hand editing, is not difficult, but my idea was to identify the application and retain as much formatting as possible. Tough case; it makes one feel almost like a detective in one of those TV series.
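
A quick-and-dirty version of that fallback could look like the Python sketch below. It assumes CTRL-B (0x02) is the space character, keeps printable ASCII and line feeds, and drops every other byte. The function name and the sample bytes are invented for illustration:

```python
def rough_text(data: bytes) -> str:
    """Turn CTRL-B into spaces, keep printable ASCII and line feeds, drop the rest."""
    out = []
    for b in data:
        if b == 0x02:             # CTRL-B, apparently used as the space character
            out.append(" ")
        elif b == 0x0A:           # keep line feeds so lines stay separate
            out.append("\n")
        elif 0x20 <= b < 0x7F:    # printable ASCII passes through
            out.append(chr(b))
        # any other byte (control codes, high-bit bytes) is silently dropped
    return "".join(out)


# Invented sample: some text, a line terminator, then the start of a new line.
sample = b"Snibston\x02Lodge\x0f\x01\x0aH\x0b"
print(rough_text(sample))
```

The stray "H" at the start of each line and any formatting bytes that happen to be printable survive this filter, which is where the hand editing comes in.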

carlsson
September 18th, 2005, 03:25 PM
Two possible options:

1. Lizzy's dad may have used an early version of Ability (www.ability.com), an integrated package consisting of word processor, spreadsheet and database included with the Amstrad PC1512. I know this is an older model than 2086, but maybe the same package was still included.

Ability still offers DOS software to registered customers. I suppose you could contact them asking if the file looks like a document from their DOS days, and if so they can help you convert it.

2. Some PC1640's were also sold with LocoScript PC, but it would not run on other machines and probably was rare. Slightly less likely to have it on a 2086. Since I don't know what LocoScript PC files look like, I can't tell if this is one of them. Luxsoft however charge 5 (UKP) per MB to convert it to a modern file format. You may contact them and ask if the file is convertible (?) and if so, send all your dad's documents on one disk or e-mail.

http://www.luxsoft.demon.co.uk/lux/pcdconv.html

Update: Ability Write 4 definitely does not recognize the document, although it is said to load Ability version 1.2 documents. That leaves LocoScript, some totally unknown 3rd party word processor, or the possibility that the file is corrupted.

mbbrutman
September 18th, 2005, 05:19 PM
I used the 'od' command on a Unix system to take a peek in the file.

There is a large header. Spaces look like they are represented by a hex 0x02.

A simple program that just turns off the high bits of the characters (like on an old WordStar file) won't do it. It's easy to get the text out of the file using something like strings, but trying to figure out the formatting is going to be painful without knowing what the sample file is supposed to look like.

Here is some sample output from 'strings':


Bill
Fairbrother
lived
further
44Thames
baker's
shop
where
Rose
cottage
now.
>>Leicester
Council
owned
farms
about
acres
each,
==designed
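
For anyone without a Unix box handy, the output above can be roughly approximated with a few lines of Python. This is only a sketch of the idea behind strings (pull out runs of at least four printable ASCII characters), not a reimplementation of the real utility. The function name and sample bytes are made up:

```python
import re


def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Invented sample in the style of the file: words separated by 0x02 bytes.
sample = b"farms\x02about\x02acres\x02each,\x0f\x01\x0aH\x0b\x00\x00\x01==designed"
print(ascii_strings(sample))
# ['farms', 'about', 'acres', 'each,', '==designed']
```

Because the words are separated by 0x02 instead of real spaces, each word comes out as its own string, and words shorter than four characters are dropped altogether.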


And here is what od thinks of part of the file:


000510 65 02 4c 69 6d 65 73 02 77 61 73 02 72 75 6e 02
e 002 L i m e s 002 w a s 002 r u n 002
000520 61 73 02 61 02 66 61 72 6d 02 69 6e 02 63 6f 6e
a s 002 a 002 f a r m 002 i n 002 c o n
000530 6e 65 63 74 69 6f 6e 02 77 69 74 68 02 53 6e 69
n e c t i o n 002 w i t h 002 S n i
000540 62 73 74 6f 6e 02 4c 6f 64 67 65 02 0f 01 0a 48
b s t o n 002 L o d g e 002 017 001 \n H
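
A similar view can be produced without od. The Python sketch below is only an illustration: it prints hexadecimal offsets and shows non-printable bytes as dots, so its layout differs from the od listing above. The function name is made up:

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as rows of: offset, hex bytes, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        lines.append(f"{offset:06x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


print(hexdump(b"as\x02a\x02farm\x02in"))
```

In this view the 0x02 bytes between words stand out in the hex column and appear as dots in the text column.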

lizzyd
September 19th, 2005, 12:59 AM
crikey you guys are brilliant - thanks so much for all the info! have just got into work and found the flurry of posts. will look forward to having a proper read through when i get home tonight.

the old amstrad is, i think, still down in the cellar in my mum and dad's house, but that's 100 miles away and they're not there for the next month or so, so i may have to borrow a car one weekend and do some investigating.

will get back to you asap anyway. Thanks so much again!!

lizzyd

carlsson
September 19th, 2005, 05:27 AM
It seems that "P2" may not be the default extension, but rather a way to say that this is page 2 of a longer document; the first line in clear text is:


INTERVIEW WITH GEORGE DOLMAN 3rd SEPTEMBER 1988 P2

Back to the control characters. Speaking in hexadecimal, every line seems to start with 48 0B followed by up to five bytes, the actual line and then ends with 0F 01 (soft line) or 0F 02 (hard line).

An empty line seems to be coded as 48 0B D0 05 0F 02.

The first line ("INTERVIEW.."): 48 0B F0 00 01 34 34 ... 0F 02
The second line ("The Limes.."): 48 0B 00 00 01 3E 3E ... 0F 01
The third line ("Farm until.."): 48 0B 48 03 01 1B 1B ... 0F 02
The fourth line is empty, see above.
The fifth line ("Brother John.."): 48 0B 58 02 01 25 25 ... 0F 02
The sixth line is empty.
The seventh line ("Uncle.."): 48 0B C0 00 01 36 36 ... 0F 01

The last two bytes (34, 3E, 1B, 25) happen to be the logical line length in hexadecimal. I am not sure why it is repeated, other than as a safeguard against a corrupted file. I can't figure out what the other three bytes do.
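
Based only on the observations above, a line splitter could be sketched in Python as follows. Everything here is guesswork about an unidentified format: it assumes each line starts with 48 0B, that the first five bytes after that are header bytes (fewer for empty lines), that 02 is a space, and that 0F 01 or 0F 02 ends the line. The names and the sample bytes are invented:

```python
import re

# Assumed record layout: 48 0B, header + text, then 0F 01 (soft) or 0F 02 (hard).
RECORD = re.compile(rb"\x48\x0b(.*?)\x0f([\x01\x02])", re.DOTALL)


def parse_lines(data: bytes) -> list[tuple[str, str, bool]]:
    """Return (header as hex, text, is_hard_break) for every record found."""
    lines = []
    for m in RECORD.finditer(data):
        body, terminator = m.group(1), m.group(2)
        header, raw_text = body[:5], body[5:]   # empty lines have a short header only
        text = raw_text.replace(b"\x02", b" ").decode("ascii", errors="replace")
        lines.append((header.hex(" "), text, terminator == b"\x02"))
    return lines


# Invented sample: one soft-wrapped text line followed by one empty line.
sample = (b"\x48\x0b\x00\x00\x01\x09\x09The\x02Limes\x0f\x01\x0a"
          b"\x48\x0b\xd0\x05\x0f\x02")
for header, text, hard in parse_lines(sample):
    print(header, repr(text), "hard" if hard else "soft")
```

This would misfire if a header byte happened to form 0F 01 or 0F 02, so treat it as a starting point for experiments and not as a converter.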

lizzyd
September 19th, 2005, 12:59 PM
wow - you lot are fantastic!! thanks so much again for all the input. i don't understand half of what you're talking about but it all sounds great!

thanks so much for spending all this time on my little problem - you've really inspired me to get it cracked. so i've just booked a train ticket back home for the night on friday so i can see if i can find the old amstrad and boot it up to get some more info.

the luxsoft option looks pretty good too. i had spotted them when i was looking for a solution a couple of weeks ago but thought they'd be too expensive, because i was just thinking about the charge per-disc, very stupidly not realising that i could of course put all the files just on 1 cd and send that off. d'oh!

so that's definitely another option. anyway, i'll get home and find out what i can over the weekend, and also give luxsoft a ring for some more info, and i'll let you know what i find out next week.

thanks very much again you three for all your time - i really appreciate it!
will keep you posted!

lizzyd