Techniques on importing vertical column data into singular record sets.

Sample of original Data:

Number Kid's name foster parent bio parents (1-2)
1 Schmo, Joe Kid John Smith Eleanor Schmo
dob: 9/22/90 43 Oleander Blvd 99 Broadway
EntranceDate: 4/30/06 Brooklyn, NY 11389 NY, NY
AnotherDate: 7/7/05
Bill Schmo
49 Elm St
Grenwich, CT 89898

To distinguishing between foster and bio parent address fields,
I used Excel's autofil feature to quickly create delimiter columns:

Number Kid's name foster parent bio parents (1-2)
1 Schmo, Joe Kid FOSTER John Smith BIO Eleanor Schmo
dob: 9/22/90 FOSTER 43 Oleander Blvd BIO 99 Broadway
EntranceDate: 4/30/06 FOSTER Brooklyn, NY 11389 BIO NY, NY
AnotherDate: 7/7/05 FOSTER BIO
FOSTER BIO Bill Schmo
FOSTER BIO 49 Elm St
FOSTER BIO Grenwich, CT 89898
FOSTER BIO

Export to tab delimited:

1 TAB "Schmo, Joe Kid" TAB FOSTER TAB "John Smith" TAB BIO TAB "Eleanor Schmo"
TAB "dob: 9/22/90" TAB FOSTER TAB "43 Oleander Blvd" TAB BIO TAB "99 Broadway"
TAB "EntranceDate: 4/30/06" TAB FOSTER TAB "Brooklyn, NY 11389" TAB BIOTAB "NY, NY"
TAB "AnotherDate: 7/7/05" TAB FOSTER TAB TAB BIO TAB
TAB TAB FOSTER TAB TAB BIO TAB "Bill Schmo"
TAB TAB FOSTER TAB TAB BIO TAB "49 Elm St"
TAB TAB FOSTER TAB TAB BIO TAB "Grenwich, CT 89898"
TAB TAB FOSTER TAB TAB BIO TAB

Edit with Vim to remove superfluous end-of-line characters and predictably distinguish records from each other:

    1) Replace the last line of each set of data with an end-of-record cue. I chose "NEWRECORD" .
        :%s/^IFOSTER^I^IBIO^I$/NEWRECORD/c

    2) Remove all end-of-line characters:
        :%s/\n//c

    3) Restore end-of-line characters where records end:
        :%s/NEWRECORD/^M/gc

        (Although in mixed windows and linux environments end-of-line characters are tricky. I think that to create end-of-line characters to read on a windows system you want to use ^M. I think it's something else in unix. But I can't seem to recall exactly at the moment. Remember that to identify tab or enter characters in vim, you must hit CTRL-V and then either the tab or enter buttons. The text turns blue then.)



This is the end of what I know I can safely manipulate with vim. I *had* been working with re-importing the file into excel so I could easily drag fields into alignment under the appropriate column headers (foster Address1, bio Address2, etc). This part's a pain in the neck. I just can't safely predict how many fields are used and the alignment is terrible. There are all these extraneous notes everywhere. But once I have column headers established, it's simple to import into MS access or save as tab-delimited and insert with sql.

I suppose I could further implement Excel's autofill to also attach an increasing integer to give me an idea of what row I was on. (eg BIO1 BIO2, etc) Then I could insert BIO1 into the field for the primary biological parent's name...and not concern myself with whether it was the correct field or not.  This way I could push this job back onto them to move the data into the correct fields.