BetterGedcom - Evaluation Of Existing Data Models

greglamberson 2010-11-29T03:30:59-08:00

Personal concerns

I'm not done researching data models myself and so do not feel ready to truly start evaluating the data models various folks have presented. However, here are my impressions and concerns generally so far:

1. Balancing Long-term vision and short-term needs - I don't want to commit to a short-term, incremental or stop-gap solution for today's realities without understanding what the future direction is. That said, I don't know that it's possible to immediately advocate for a data model that supports the research process (i.e., embraces evidence, deliberations, analysis, etc.). We must understand where we're going first and have a plan on how to get there from where we are.
2. Embrace of a "scholarly genealogy" approach - Rather than just adding some places in a data model for more analysis, I feel we have to explicitly change the focus of genealogy data from the conclusions we have in today's software to an evidence-centric model. Such an approach is more truly in keeping with today's best practices. Many people advocate the need for more explicit evidence, but these approaches are mostly supplements to a conclusion model. Anyway, that's my opinion right this minute.

These are my 2 biggest concerns at the moment.

hrworth 2010-11-29T04:18:22-08:00

Greg,

I am not sure there is a rush on any of this. That is NOT to say that what we are doing is not important, because it is.

How long have we been working on this project?

How long has the current GEDCOM been "broken"?

The issue, for me at this time, is that we don't have all of the right players at the Wiki Table.

We need software developers and we need the Scholarly Genealogists at the table.

I am an End User of one program and GeneJ is an End User of another; that is only two users. How many other computer-based programs are there, not to mention web-based applications?

You technical folks are doing an awesome job of "putting stuff out there", but I think we have a long way to go before any decisions are made.

In my opinion, we need to take our time and weigh the positive and negative effects of what we are about, with everyone at the table.

My 2 cents on this.

Russ

greglamberson 2010-11-29T11:11:49-08:00

Well, I don't think I was implying there's any particular rush on things. Far from it.

So far what I've seen on the technical side is a lot of talk on technical things that make assumptions on where we're headed that I don't think are necessarily correct. Regarding having the right people at the table, frankly, if we haven't even decided where we are headed yet, we're not in any danger of trying to get there without any particular point of view at the table.

Also, this is going to be a renegotiation at every step of the way. This is not a strictly linear process. As we move further along and more people become involved, we're going to end up going over every assumption over and over again. This is going to be particularly true once we get to the stage of having a basic concept in mind.

louiskessler 2010-12-22T22:57:56-08:00

We Need To Compare The Various Models

We have Tom's DeadEnds data model, that he is very happy with. We have Mike's SFT data model that he is very happy with.

We've got the original GEDCOM, which is better than most people think, because it has many features that they don't know about. There is no need to re-write what works.

We've got the GEDCOM 6.0 XML Draft which was a valid first attempt by the LDS people to update GEDCOM.

Then we've got GRAMPS, GenTech and a host of other models.

We can argue concepts until we are blue in the face. But until the models are compared right down to the individual Entity/Element/Tag level and translated back to a common form (e.g. GEDCOMish), possibly with concrete examples to make everything clear, then I'm afraid we won't be able to objectively decide what's best and what's not.

Some proper evaluation will need to be done.

Suggestions?

mstransky 2010-12-23T06:49:16-08:00

"We can argue concepts until we are blue in the face" -Louis
So true, I would not try to make anyone change what may work for them or myself. But I think GEDCOMish should be used as the stepping stone format. Even though I am 100% XML, I still consider the many apps out there that rely on a GEDCOM model.

I have already started to match my tags to the GEDCOM tags. But the many custom tags that various apps have created may be, and are, still unaccounted for.

For that small <2% that cannot be synced, I suggested a universal CATCHall tag that stores the original exporting tag and data in a note-like field, separated by "/" or ":".

Then the importing app will allow a user to view errored or unrecognized tags, with the original purpose shown by the "Tag" in front of the data. This might suffice for a short while during import/export tests for what may never be 100%.
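To make the catch-all idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the tag name "CATCHALL", the ":" separator, the tiny list of known tags, and both function names are hypothetical and are not taken from any GEDCOM specification or from anyone's actual model.

```python
# Hypothetical sketch of the catch-all idea. The tag name, the separator and
# the sample list of known tags are illustrative assumptions only.
KNOWN_TAGS = {"NAME", "BIRT", "DEAT", "DATE", "PLAC", "NOTE"}
CATCHALL = "CATCHALL"
SEP = ":"


def export_line(tag, value):
    """Return (tag, value); unrecognized tags are wrapped in the catch-all."""
    if tag in KNOWN_TAGS:
        return tag, value
    # Keep the original tag in front of the data so nothing is lost.
    return CATCHALL, f"{tag}{SEP}{value}"


def import_line(tag, value):
    """Return (original_tag, value, recognized) for one exported line."""
    if tag != CATCHALL:
        return tag, value, True
    # Split on the first separator only, so data containing ":" survives.
    original_tag, _, original_value = value.partition(SEP)
    return original_tag, original_value, False


if __name__ == "__main__":
    exported = export_line("XYXY", "some custom data")
    print(exported)
    print(import_line(*exported))
```

An importing app could list every line where the last value is False, so the user sees the original tag next to the data it carried.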

But yes, I agree those knowledgeable of GEDCOM 5.5 and 5.5.1 could start a list of MAJOR tags, then a list of minor (not used as much) tags.

Then apps only need to make minor export changes to the GEDCOMish BG structure + CATCHall tag.

Then I am sure a number of apps can start importing and exporting files. Russ would have a ball saying "what the heck is the XYXY tag?" and working to better sync tags toward a main, ideal form of communication.

"Some proper evaluation will need to be done."
-Louis
By having a (GEDCOMish "BG") tag list, it is up to the app devs to import and export to it. This will be a universal stepping point for all. That list can act as a sync list and as an explanation for an outside person looking in at another's model, IF that app dev provides it.

If each MODEL provides a sync list matched to the BG list, that is how we can compare the data fields the various models handle or capture outside of the norm, or find an equivalent for them.
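A rough sketch of how such sync lists could be compared, again in Python. Everything in it is made up for illustration: the short BG tag list, the two model names and their field names are assumptions, not the real tag lists of any existing model.

```python
# Hypothetical data: the BG list and both sync lists are invented examples.
BG_TAGS = {"INDI", "NAME", "BIRT", "DEAT", "FAM", "SOUR"}

# Each sync list maps a model's own field name to a BG tag,
# or to None when the model has no equivalent in the BG list.
SYNC_LISTS = {
    "ModelA": {"Person": "INDI", "PersonName": "NAME",
               "Birth": "BIRT", "Evidence": None},
    "ModelB": {"individual": "INDI", "family": "FAM",
               "citation": "SOUR", "persona": None},
}


def compare(sync_lists, bg_tags):
    """For each model, list fields with no BG match and BG tags it skips."""
    report = {}
    for model, mapping in sync_lists.items():
        matched = {bg for bg in mapping.values() if bg in bg_tags}
        report[model] = {
            "unmatched_fields": sorted(
                field for field, bg in mapping.items() if bg not in bg_tags
            ),
            "bg_tags_not_covered": sorted(bg_tags - matched),
        }
    return report


if __name__ == "__main__":
    for model, result in compare(SYNC_LISTS, BG_TAGS).items():
        print(model, result)
```

The "unmatched_fields" entries are the ones captured outside of the norm, which are the candidates for the CATCHall tag or for a new BG tag.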

ttwetmore 2010-12-23T08:57:01-08:00

I believe we should add Event GEDCOM to the list. It was a CommSoft Proposal from 1994. It included some excellent modifications to lineage-linked GEDCOM, including both Events and Places as record level objects. Here is a link to the proposal.

http://deadendssoftware.com/EventGEDCOMDraft1.0.pdf

Given that it is now 16 years later one might say they were ahead of their time.

Any comparison of models should start with a list and definitions of the records supported by each model, and some way to contrast the different definitions.

Tom Wetmore

mstransky 2010-12-23T09:35:35-08:00

Wow, that is incredible. That is what I am doing. Instead of placing all records inside the INDI, all my EIDs link out to universal records. Also, the person relationships are preserved via my PID and Family/GroupID table.

Maybe I have not worded my concept very well, but you can read the PDF Tom added; that is the basis of what I made my model to do.
Please note I am not saying this is the way to go; it is just how I made my model store data. If others adopt Events and Places as record-level objects, you will find that storage of data becomes more flexible and compact.
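As a small illustration of why record-level Events and Places can be more compact, here is a sketch in Python. The ID formats, field names and sample people are hypothetical; this is not the actual layout of the Event GEDCOM proposal or of any model discussed here.

```python
# Hypothetical records: IDs, field names and sample data are assumptions.
places = {"P1": {"name": "Springfield"}}

events = {
    "E1": {"type": "birth", "date": "1 JAN 1900", "place": "P1"},
    "E2": {"type": "marriage", "date": "5 JUN 1925", "place": "P1"},
}

# Persons hold links to event records instead of embedding the event data.
persons = {
    "I1": {"name": "John Example", "events": ["E1", "E2"]},
    "I2": {"name": "Mary Sample", "events": ["E2"]},
}


def events_for(person_id):
    """Resolve a person's event links into (type, date, place name) tuples."""
    resolved = []
    for eid in persons[person_id]["events"]:
        event = events[eid]
        resolved.append(
            (event["type"], event["date"], places[event["place"]]["name"])
        )
    return resolved


def participants(event_id):
    """Return the IDs of every person linked to one event record."""
    return sorted(pid for pid, p in persons.items() if event_id in p["events"])


if __name__ == "__main__":
    print(events_for("I2"))
    print(participants("E2"))
```

The marriage is stored once as E2 and linked from both persons, and the place is stored once as P1, rather than being repeated inside each individual's record.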

So yes "I am in favor" on that motion if more would adopt that concept.

GRAMPS Data Model Evaluation

Comments