I'm developing a C++ Date Class (GED-DATE) in aid of parsing GEDCOM files. What I need help on is collecting as many examples as possible from the 'wild'. This may include non-standard forms as well (_DATE comes to mind). What I want is to be able to parse what is out there (thanks Mulder). As an example, in my dev directory I have:
For a total of 54358 lines of GEDCOM data and 5969 lines of DATE information. Hardly a sufficient sample, but a start.
I gathered my test file with a command line of:
C:\grep DATE *.ged > dates.txt
If those of you willing to help out would do the same for as many .ged files as you might have, and either link or post the results in this discussion, I would greatly appreciate it. In addition, if you know of any additional .ged files that are available to the public, links to those would be very helpful.
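For anyone without grep handy, here is a minimal C++ sketch of the same extraction. It is not part of GED-DATE; the function name collect_date_values is made up for illustration, and it assumes each GEDCOM line has the usual shape of level, optional @xref@, tag, then value.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the value of every line whose tag is DATE
// (or the non-standard _DATE). Assumed line shape: level [@xref@] tag [value]
std::vector<std::string> collect_date_values(std::istream& in) {
    std::vector<std::string> values;
    std::string line;
    while (std::getline(in, line)) {
        // Files written on Windows end lines with CR LF; drop the CR.
        if (!line.empty() && line.back() == '\r') line.pop_back();

        std::istringstream fields(line);
        std::string level, tag;
        if (!(fields >> level >> tag)) continue;   // blank or malformed line

        // An optional cross-reference id such as @I1@ precedes the tag.
        if (tag.front() == '@') {
            if (!(fields >> tag)) continue;
        }
        if (tag != "DATE" && tag != "_DATE") continue;

        // Everything after the tag is the date value (may be empty).
        std::string value;
        std::getline(fields >> std::ws, value);
        values.push_back(value);
    }
    return values;
}
```

Unlike a plain grep for DATE, this only keeps lines where DATE is the tag itself, so a line such as "1 NOTE DATE of arrival unknown" is skipped. Whether that is what you want for a survey of wild forms is a judgment call; the grep output is perfectly usable too.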
You might ask of what use this would be to BetterGedcom. Without getting into (hopefully) any politics, it is my feeling that BG will need to handle legacy DATE forms in any attempt to move ahead. This would certainly include the material I'm looking for. A second consideration is that this will be open source and possibly of use to BG when it comes time to cut code. My plan is to be able to parse and convert any DATE form, excepting DATE_PHRASE of course, and to provide date arithmetic (ranges, age, etc.) as well as a variety of output formats. I believe I'm not the only one who stared at +1 DATE PLUV 0012 without knowing that it really (Gregorian bias here...) was equivalent to +1 DATE FROM 22 JAN TO 20 FEB 1804 (I think I got that right :) )?
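To give a feel for the first step of such a parser, here is a rough sketch that sorts a DATE value into the broad forms GEDCOM 5.5 describes (period, range, approximated, interpreted, phrase, plain date) by looking at its leading keyword. The enum and function names are hypothetical and are not the GED-DATE API. The sketch assumes upper-case keywords and a calendar escape, if any, at the very front of the value; real files will break both assumptions.

```cpp
#include <sstream>
#include <string>

// Hypothetical categories, loosely following the GEDCOM 5.5 DATE_VALUE forms.
enum class DateForm { Empty, Simple, Period, Range, Approximated, Interpreted, Phrase };

DateForm classify_date_value(const std::string& value) {
    std::string rest = value;

    // Skip a leading calendar escape such as @#DFRENCH R@ or @#DJULIAN@.
    if (rest.rfind("@#D", 0) == 0) {
        std::string::size_type end = rest.find('@', 3);
        if (end != std::string::npos) rest.erase(0, end + 1);
    }

    std::istringstream words(rest);
    std::string first;
    if (!(words >> first)) return DateForm::Empty;

    if (first.front() == '(') return DateForm::Phrase;           // (free text)
    if (first == "FROM" || first == "TO") return DateForm::Period;
    if (first == "BET" || first == "BEF" || first == "AFT") return DateForm::Range;
    if (first == "ABT" || first == "CAL" || first == "EST") return DateForm::Approximated;
    if (first == "INT") return DateForm::Interpreted;             // INT date (phrase)
    return DateForm::Simple;                                      // e.g. 12 JAN 1850
}
```

With this, "BET 1850 AND 1860" comes back as Range and "@#DFRENCH R@ PLUV 12" as Simple. Anything that falls into Simple but does not then parse as day, month and year is exactly the kind of wild form a collection like dates.txt should help to flush out.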
Also, for those interested in the 'DATE' problem, it would be quite useful to me to hear what people might like in such a creation, or any other discussion on the subject.
(hsmyers atsign gmail dot com)
usual sig='s --hsm