GEDCOM X

GEDCOM X is being designed to be an open standard for genealogical data communications.

The following information is taken from the GEDCOM X About page:


A Long Time Coming

A lot of things have evolved with genealogical technology since the original GEDCOM format was specified. Genealogical applications aren't just about making conclusions anymore. Indeed, a much more sound research philosophy focuses more on records and evidence than it does on making conclusions. Furthermore, with the advent of powerful search engines, software as a service (SaaS) offerings, and social networking applications, the legacy GEDCOM model just isn't going to cut it anymore.

Up through 2010, FamilySearch had been busy with other important things, like online access to their huge collection of records. But around 2010 a lot of notable events--including the sprouting of some impressive standardization efforts--came together to raise the priority of a new GEDCOM. By the end of RootsTech 2011, it became clear that the community needed something new...

Project Scope

GEDCOM X has a much broader scope than did legacy GEDCOM. The scope of legacy GEDCOM was primarily limited to allowing users to make conclusions about genealogical information, and provided only superficial support for citing evidence and sources. Legacy GEDCOM was primarily designed to be saved as a file to the hard drive of an isolated desktop computer and never considered the needs of other data providers, like an online web application.
GEDCOM X is designed to continue to meet the requirements accounted for by legacy GEDCOM. This means GEDCOM X will provide support for genealogical conclusion data and a file format. But GEDCOM X expands on the original scope of legacy GEDCOM to include concepts such as:


The scope of GEDCOM X also includes support for standardization of the APIs that can be used to work with genealogically relevant data. These interfaces are based on the same principles that made the World Wide Web a success and provide an industry-standard way to do things like:

Models and Profiles

GEDCOM X is neatly partitioned in such a way so as to allow developers to easily use the pieces they need without having to swallow the entirety of the specification. The data is divided into different Data Models that define the genealogical data types and their properties.

But the GEDCOM X specification defines not only the data models used to describe genealogical data, but it also defines a set of APIs that describe standard operations on genealogical resources. The API specifications are divided into different Application Profiles that are intended to address specific sets of well-defined requirements and use cases.

To read about the different GEDCOM X data models, see the data model documentation.

To read about the application profiles, try starting with the developer's guide.

Here for the Long Term

GEDCOM X is designed for the long term. Through solid design principles and active community support, GEDCOM X is the standard mechanism to establish a rich and collaborative environment for the noble work of genealogical research.



Links to GEDCOM X pages


gedcomx.org - the project page
gedcomx.net - the community page
github.com/FamilySearch/gedcomx - the project repository
familysearch.github.com/gedcomx/atom.xml - the project blog (atom feed)
github.com/FamilySearch/gedcomx/wiki - the project wiki
github.com/FamilySearch/gedcomx/issues - the project issue tracker

Links to Articles about GEDCOM X


Glimpses of GEDCOM X - Randy Seaver, 2012 02 07
Ryan Heaton: A New GEDCOM - the Ancestry Insider, 2012 02 04
FamilySearch releases GEDCOM X - Tamura Jones, 2012 02 02
GEDCOM X - Tamura Jones, 2011 12 12

Comments

ACProctor 2012-02-08T05:40:09-08:00
W3CDTF Terms
There is a profound weakness in the W3CDTF terms used by GEDCOM-X.

These rely on the W3C encoding for dates and times, which in turn relies on specific locale-neutral date forms from the ISO 8601 standard. See http://www.w3.org/TR/NOTE-datetime.

The problem for genealogical usage is that many registrations occur on a quarterly basis. In order to represent such registration dates accurately the notation must be capable of addressing yearly quarters too. This is a glaring omission in ISO 8601 and directly affects all genealogical and family-history usage.

STEMMA was aware of this shortcoming and introduced the date form yyyy-Qd (e.g. 1956-Q2) which is both compatible-with and entirely in-keeping-with the existing ISO date standard. See http://www.parallaxview.co/familyhistorydata/home/document-structure/event/dates.
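As a rough illustration of how a yyyy-Qd value could be handled by software, the sketch below expands it into the first and last day of the quarter. It assumes plain calendar quarters (Q1 = January to March, and so on); the function name quarter_to_range is invented for this example and is not part of STEMMA, GEDCOM X, or ISO 8601.

```python
import re
from datetime import date, timedelta

# Matches the yyyy-Qd form described above, e.g. "1956-Q2".
QUARTER_RE = re.compile(r"^(\d{4})-Q([1-4])$")


def quarter_to_range(text):
    """Return (first_day, last_day) of a calendar quarter written as yyyy-Qd.

    Hypothetical helper: assumes Q1 = Jan-Mar, Q2 = Apr-Jun, and so on.
    """
    match = QUARTER_RE.match(text)
    if match is None:
        raise ValueError(f"not a yyyy-Qd quarter: {text!r}")
    year, quarter = int(match.group(1)), int(match.group(2))
    first_month = 3 * (quarter - 1) + 1
    start = date(year, first_month, 1)
    # The quarter ends the day before the next quarter starts.
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, first_month + 3, 1)
    return start, next_start - timedelta(days=1)


start, end = quarter_to_range("1956-Q2")
print(start.isoformat(), end.isoformat())
```

A quarter is inherently a range, so returning both ends keeps the registration's real precision instead of forcing it onto a single day.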

FHISO are currently talking to ISO/TC 154 about incorporating this format in a revision of the ISO 8601 standard.

Tony
ACProctor 2012-02-08T10:21:27-08:00
That's a good point Adrian. STEMMA bottled-out there and simply left a way of providing a calendar name.

In general, there is no algorithmic way of converting dates between all the calendars used worldwide now and in the past. This means the stored computer-readable date has to be stored in its original calendar.

What is needed - in addition to a calendar-name property - is an equivalent to ISO 8601 for each calendar. That must provide locale-neutral fields (i.e. numeric only) in sorting order. This is what the W3C (& STEMMA) representations use for their Gregorian dates.

Unfortunately, this requirement is not of mainstream interest and so I'm not aware of any such standard. I am hoping FHISO can do something here in conjunction with ISO, when it gains a bit more weight that is.

Just for clarification: GEDCOM-X cannot "fix" this issue. It could work around the deficiency, as did STEMMA, but that's not the right move. FHISO are trying to achieve a revision to ISO 8601 which would then percolate down to the W3C date/time representations.

Tony
louiskessler 2012-02-08T11:35:21-08:00

Tony,

At Ryan's presentations at RootsTech, he did specifically state that any concerns and issues regarding GEDCOM X should be placed on the project issue tracker at github:
https://github.com/FamilySearch/gedcomx/issues

He says that's where issues re GEDCOM X will be addressed and discussed.

By the way, at Geir's request, I've created a page here on the BetterGEDCOM wiki with introductory info about GEDCOM X. You'll find it under "Data Models" at:
http://bettergedcom.wikispaces.com/GEDCOMX

Louis
louiskessler 2012-02-08T11:45:25-08:00
Look at that. I didn't realize you already saw the page. That's obvious now to me because this discussion post is attached to it.

Louis
GeneJ 2012-02-08T19:36:50-08:00
Tony, Adrian,

Have either of you read some of the US Library of Congress work on extended date/time standards?

Here's a link to the "Extended Date/Time Format" standards webpage. You can link to the Draft Specification from there.

http://www.loc.gov/standards/datetime/
ACProctor 2012-02-09T02:00:39-08:00
Thanks for the link Gene. I wasn't aware of that work.

IMHO, it's a pretty horrible specification that mixes-up a whole bunch of different issues such as ranges, precision, uncertainty.

It makes no reference to alternative calendars and so is not relevant to the exchange between myself and Adrian.

The most interesting part was years > 4 digits but I already had a planned (2nd) proposal for ISO to incorporate this as a calendar extension of the 8601 standard.

Tony
ttwetmore 2012-02-09T03:04:49-08:00

And we can consider the DeadEnds approach to dates, via the link to its description on the DeadEnds model page. Or with direct link:

http://bartonstreet.com/deadends/DateFormats.pdf

This approach has one wonderful attribute beyond that of being fully computer processable -- it is also human readable and writeable because, gosh darn it, it's human understandable, and, glory be, it's exactly what you'd write naturally anyway.

Of course, because it must, it also handles the uncertainties, the ranges, the downright awkwardnesses that can be present in genealogical dates.

I once again strongly question the wisdom of applying strict, restricted, standards to what is a sloppy, humanistic data domain. I've been blowing this horn for decades, and slowly losing the battle to the young geeks. Odd to say as I am among the geekier of the geeks. But I'm a wise old geek.
AdrianB38 2012-02-09T09:30:48-08:00
Tom - do you have any insight into how the GEDCOMX guys imagine the original and formal values for date should be handled? I can find nothing - but I did just spend rather too long looping round their documents and back to the same page trying to find the answer, so it may be my fault I can't find the answer.

My initial thought was that having original and formal values would seem a sensible way of combining your view (with which I am much in sympathy) and Tony's - the original could contain the sloppy human text and the formal would contain a "proper" date _where_possible_ (e.g. "a year after the Norman Conquest" does NOT equate to 1067 in my book - 1067 +/- 1y would be closer to the truth). My suspicion is that in many cases, such a conversion will simply not be possible.

I have no idea whether this failure to map causes the GEDCOMX people a concern or not. Anyone any idea what their expectation is????
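One way to picture the original-plus-formal idea is a small record in which the original text is always present and the formal value is optional. This is only a sketch: the class and field names (RecordedDate, original, formal, calendar) are invented for illustration and are not taken from the GEDCOM X documents.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordedDate:
    """Hypothetical pairing of a date as written with an optional formal value."""
    original: str                   # the text exactly as found in the source
    formal: Optional[str] = None    # e.g. an ISO 8601 string, only where one exists
    calendar: Optional[str] = None  # name of the calendar the formal value uses


def sort_key(d):
    """Sort dates with a formal value first; unmappable ones go to the end."""
    return (d.formal is None, d.formal or "")


dates = [
    RecordedDate("a year after the Norman Conquest"),  # no safe formal value
    RecordedDate("January 1, 1880", formal="1880-01-01", calendar="gregorian"),
    RecordedDate("1 Jan 1851", formal="1851-01-01", calendar="gregorian"),
]
for d in sorted(dates, key=sort_key):
    print(d.formal, "|", d.original)
```

Leaving formal empty when no trustworthy mapping exists records the failure to map explicitly rather than hiding it behind a guessed date.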
ACProctor 2012-02-09T09:36:16-08:00
I hadn't noticed the original might not be present Adrian.

"Tony's view" (for what it's worth) is that both are required. STEMMA has a DATA_ATTRIBUTE called 'Original' for exactly this purpose.

Tony
ttwetmore 2012-02-09T14:15:32-08:00
Adrian,

I don't have any insight on the GEDCOMX date format. I've followed the loops in the documents also. It seems a Date is a Field, a Date is also composed of DateParts, and DateParts are also Fields. That seems to be the extent of information now available. I would expect that there will be a format forthcoming that defines the legal values that the contents of those fields may have. It may even exist and we may just be unable to find it yet!

I agree with you and Tony that both an original form and a processed form may be necessary in some situations. I don't go as far down the road as Tony does in prescribing the format of the processed forms to strict standards. I must admit to thinking of the "processed" form more as a sorting key than as an actual date. That is, I tend to add the processed form to a record in my database only when the original form is not adequate for my (fairly sophisticated) date parser to be able to figure it out well enough to use it as a sort key.

Tom
nick-mat 2012-02-10T06:04:19-08:00
Hi, I hope you will forgive a newcomer jumping in with both feet, but dates and calendars are an interest of mine (they allow me to get back to programming basics).

TFP (The Family Pack - my program and database effort http://thefamilypack.org) stores dates as a single integer, with a range to indicate uncertainty. This has a number of advantages: it simplifies sorting and calculations, it is culturally neutral, and it can be displayed in any format you like (provided you have an algorithm for it).
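To make the integer-plus-range idea concrete, here is a minimal sketch that uses the Julian Day Number as the integer. Whether TFP uses this particular day count is not stated above, so treat that as an assumption; the names gregorian_to_jdn, DayRange, exact and whole_year are invented for the example.

```python
from dataclasses import dataclass


def gregorian_to_jdn(year, month, day):
    """Julian Day Number for a (proleptic) Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


@dataclass(frozen=True, order=True)
class DayRange:
    """An uncertain date stored as an inclusive range of day numbers."""
    earliest: int
    latest: int

    def span(self):
        return self.latest - self.earliest + 1


def exact(year, month, day):
    n = gregorian_to_jdn(year, month, day)
    return DayRange(n, n)


def whole_year(year):
    """A date known only to the year, e.g. 'sometime in 1880'."""
    return DayRange(gregorian_to_jdn(year, 1, 1), gregorian_to_jdn(year, 12, 31))


events = [exact(1880, 6, 1), whole_year(1880), exact(1879, 12, 25)]
for r in sorted(events):  # plain integer comparison, no calendar logic needed
    print(r.earliest, r.latest, r.span())
```

Sorting and interval arithmetic work on the integers alone; calendar knowledge is only needed when converting to or from a display form.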

Re Tony's comment on converting dates between all the calendars used worldwide now and in the past. I don't believe the picture is as bleak as you paint it - provided you stick to the period of time the calendar was in official use then the available algorithms are pretty good and AFAIK are reversible, meaning you can convert back and forth between different systems with no loss of data.

Probably the most difficult aspect of historical dates is the variable year change. For instance, sometime after the twelfth century England began to use the 25th of March as the start of the new year; this continued until 1752 and the change to the Gregorian calendar. In Scotland, however, the new year has been the 1st of January since 1600 (the Scottish celebration of Hogmanay goes back a long way). Other European countries have had different year starts. The TFP solution to this is to have Local Calendars which can account for both the local year change and the change over from Julian to Gregorian calendars. This gives us the English Calendar and Scottish Calendar etc as options for describing a given date.
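The year-change part of a local calendar can be sketched in a few lines. The example below assumes an English-style calendar whose year number changes on 25 March and produces the usual dual-dated form; the function names and the year_start parameter are invented for illustration and do not describe how TFP implements its Local Calendars.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def historical_year(day, month, written_year, year_start=(3, 25)):
    """Year number counted from 1 January for a date written under a local
    calendar whose year number changes on year_start, given as (month, day)."""
    if (month, day) < year_start:
        return written_year + 1
    return written_year


def dual_date(day, month, written_year, year_start=(3, 25)):
    """Render a date with dual dating (e.g. 1750/51) where the two years differ."""
    hist = historical_year(day, month, written_year, year_start)
    label = f"{day} {MONTHS[month - 1]} {written_year}"
    if hist == written_year:
        return label
    if hist // 100 == written_year // 100:
        return f"{label}/{hist % 100:02d}"
    return f"{label}/{hist}"  # century rollover: write the second year in full


print(dual_date(8, 2, 1750))                     # year starting 25 March
print(dual_date(8, 2, 1750, year_start=(1, 1)))  # year starting 1 January
```

This handles only the year numbering. The Julian to Gregorian shift is a separate step and would need its own conversion.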

The danger in this approach is that it can lead to a false sense of precision, but I think it important to be clear about what is being shown on a particular document, even if we later go on to round things off in our conclusions.

Nick
ACProctor 2012-02-10T06:21:40-08:00
Thanks Nick. Useful notes

Re: Conversions, I'm talking about the general case. Usually, the issues arise between Christian and non-Christian calendars.

Here's one such argument that was presented to me on a genealogy forum:-

"...even today the Hindu calendar is not merely an ancient calendar, it's in common usage -- Government of India documents carry three dates : the Christian date, and two disparate Hindu dates. For the past 50-few years, that's provided at least an officially correspondant dating. Prior to that ... unreliable for extrapolation or even interpolation".

Otherwise, I entirely agree that the "stored date" (as opposed to the original textual version) must have an associated calendar property.

Tony
nick-mat 2012-02-10T09:24:14-08:00
These conversions are unreliable when they rely on complex astronomical observations and calculations. There's no guarantee that they came up with the same answer as we do, since we are working with different data. But we are only talking a day or two at most. Taking the pragmatic approach, if we enter a Hindu date and convert it to an integer for storage then convert it back to a Hindu date for display, provided it's the same as we started with there's no problem. There may occasionally be a problem entering a date which the system thinks is illegal - the equivalent of trying to enter 29 Feb 2011. This could be intended to mean the last day of February or the first day of March. The safest way is to code it as both, i.e. a range of two days.
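The two-day-range idea for an "illegal" date can be sketched as follows. For simplicity the example uses Python's built-in proleptic Gregorian day ordinal as the stored integer rather than a Hindu calendar, which is an assumption made purely for illustration; entered_date_to_range is a hypothetical helper name.

```python
import calendar
from datetime import date, timedelta


def entered_date_to_range(year, month, day):
    """Return (earliest, latest) day ordinals for a date as entered.

    A valid date maps to a single day. A day number just past the end of the
    month (such as 29 Feb 2011) is stored as a two-day range covering the last
    day of that month and the first day of the next.
    """
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError("month or day out of any plausible range")
    last_day = calendar.monthrange(year, month)[1]
    if day <= last_day:
        n = date(year, month, day).toordinal()
        return (n, n)
    last = date(year, month, last_day)
    first_of_next = last + timedelta(days=1)
    return (last.toordinal(), first_of_next.toordinal())


earliest, latest = entered_date_to_range(2011, 2, 29)
print(date.fromordinal(earliest), date.fromordinal(latest))

# Round trip for a valid date: store as an integer, convert back, compare.
n, _ = entered_date_to_range(2012, 2, 29)
print(date.fromordinal(n) == date(2012, 2, 29))
```

Documenting that such a range was generated from an invalid entry, as suggested above, keeps the original ambiguity visible.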

It's difficult to see where these possible small differences would actually become a problem. Provided you document what's going on, it's probably the best we can do.

Nick
heatonra 2012-02-08T08:03:28-08:00
Excellent observation, Tony. Why don't you open up an issue on Github so we can discuss and get that fixed?
ACProctor 2012-02-08T08:15:55-08:00
Thanks Ryan. I'm not currently a Github user though.

Can you confirm what you would expect this to achieve? Is it a matter of gaining support for the proposed change?

Tony
AdrianB38 2012-02-08T10:01:38-08:00
Is it also appropriate to ask at this point about 2 further date issues? Specifically
- what about dates represented originally in non-UTC calendars (e.g. French Revolutionary dates, English regnal years, lots of other non-Western calendars)
- what about Julian dates? The date 8 February 1750 is ambiguous. If it's a date copied exactly from a source in England, then it should be represented as 8 February 1750/51 or 8 February 1750 OS or 8 February 1751 NS to resolve the ambiguity. (If it's an interpretation of such a date with none of that stuff, then it truly is ambiguous). BUT it's still a Julian date and would not map to the Gregorian(?) date 1751 02 08 UTC.

If it's a date copied exactly from a source in France then I _think_ it's not ambiguous because by that time France had moved onto a New Year starting 1 Jan and the Gregorian calendar but dates centuries earlier in France would have the same issue.

I suspect the first issue can be resolved by switching to UTC for the formal representation but needing to record the textual version in the original format.

The 2nd baffles me somewhat because I've no idea what UTC means taken back that far in history to when Julian dates could be found AND when the New Year was not 1st Jan (which is NOT the same issue - just normally linked).
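For the Julian-date question, the calendar shift itself (as distinct from the New Year question) is a purely arithmetic conversion. The sketch below goes through the Julian Day Number using the standard integer formulas; the function names are invented for the example, and it assumes the year has already been normalised to a 1 January start (so the ambiguous 8 February 1750 is taken here as 8 February 1751 NS-year, Julian calendar).

```python
def julian_to_jdn(year, month, day):
    """Julian Day Number for a date in the Julian calendar (1 January year start)."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn):
    """(year, month, day) in the Gregorian calendar for a Julian Day Number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return (year, month, day)


def julian_to_gregorian(year, month, day):
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


# 8 February 1751 in the Julian calendar, expressed in the Gregorian calendar.
print(julian_to_gregorian(1751, 2, 8))
```

In this period the two calendars are 11 days apart, which is why the formal Gregorian value differs from the date written in an English source even after the year has been resolved.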

Tell me if I'm jumping the gun here, but Tony's point rang bells.

Adrian
ttwetmore 2012-02-08T17:48:55-08:00
GEDCOMX and Personas
Quoting from the GEDCOMX web page:

The process of making the contents of the source available is called extraction. For example, if the source is a census that lists a John Smith born January 1, 1880 then the result of extracting the source will be a piece of structured digital data called a record that specifies a PERSONA with a name "John Smith" and a birth fact with "January 1, 1880" as the text of the date.

I have capitalized persona to make it more obvious. I have been preaching the same concept being expressed in this paragraph for years, as a necessary addition to our genealogical data models, in order to bring them up to the level required to support research and record level genealogical work.

I have repeatedly asked the question, How do you want to record your evidence in your genealogical databases? The answer to me has always been crystal clear -- as persona records. This is clearly the form that all data providers (such as Ancestry.com and FamilySearch) can best support, and clearly the best form for all genealogical algorithms to work with.

Thanks to GEDCOMX for understanding this fundamental need of the next generation of genealogical software.
testuser42 2012-02-15T03:18:25-08:00
My opinions
Now that I've read a bit more about GedcomX over at their Github HQ, I want to write down my impressions so far. The good, the bad and the ugly? ;)

First, about the overall project:

- Very good: The team around GedcomX seem genuine in opening up to opinions from outside, they really seem to want to do this right. There are many people at the Github pages involved in very good discussions. Ryan does a great job as a moderator.

- Also very good: the people there come from various backgrounds, it's not FS staff only.
International participation could be stronger, I think an active effort needs to be made to get voices from non-English-speaking countries.
"Layman users" aren't there, which IMHO is good for serious and focused work on the issues. The user perspective could be an area where BetterGedcom and the blogging community could help, in case some things might be forgotten.

- Things that need more clarification: How independent or dependent is GedcomX really?
In a Feed post dated 2011-12-08, Ryan wrote about a decision at Family Search:


That sounds good. It will have to be seen how open the standard will be, and how tough decisions will be made. Up to now, I think Ryan does his best to move the work along while not being an evil dictator. Backing by FS is great, it gives a lot of clout and power. And of course, a new Gedcom needs to fulfill all the needs that FS has. But a true standard will fulfill these needs in a neutral way, while equally fulfilling all the needs that other users might have. That's why I like the last sentence in the quote above.
FS should not try to "own" the project, this will scare others away. Maybe FHISO could be an independent organisation involved in the effort, to put some fears to rest?
testuser42 2012-02-15T11:02:00-08:00
A thing I forgot to mention: What's the use of MIME as a File-Format? http://www.gedcomx.org/File-Format.html
Do they really want to have JPG, MP3 or AVI files inline?? I think that's very silly, to put it mildly.

re Adrian:
re 2-sub models:
yeah, that's probably what they mean. But it looks as if they discussed changes in one model that don't happen in the other? hmm.
The data isn't so complex in my mind that people need to look at 2 sub-models. The data in the records (=Evidence) is going to be "recycled" for the Conclusions anyhow. It's not far apart.

re Record entity may be the Source:
It's definitely the most similar thing...

re Relationships:
I had to google UML symbols too, and wouldn't claim to be sure of them. The numbers at the links are easier to understand. I don't get the "GenealogicalResource" and "GenealogicalEntity" boxes, too. They seem to link the Records and the Conclusions, but what are they?
The problem of unreadable UML and missing Text has been mentioned https://github.com/FamilySearch/gedcomx/issues/114
A good thread about the Relationship issue: https://github.com/FamilySearch/gedcomx/issues/7

re Names and Dates:
Good point! I think they forgot that. Come to think of it, nearly everything (even gender) would need a date. Facts have Dates, but the vital facts seem to miss them.

re Family:
Yes, they'll do away with the "Family" record. That's the better way for the future IMHO, but the point you make is valid. I guess FS must have dealt with that already in their database. Is it really so often that GEDCOM has things attached to FAM?
testuser42 2012-02-15T11:06:29-08:00
Another find:
https://github.com/FamilySearch/gedcomx/wiki/DeadEnds-Mapping
A bit vague, and probably not up to date.

Tom has a Github account and is taking part in the discussions there. Maybe he (or Ryan, if he finds the time) can put our questions up there? Or should we get accounts? I don't really want to, since I'm in no way a developer and can't follow the finer points. Maybe they need their own forum for User questions?
louiskessler 2012-02-15T19:53:38-08:00
The one thing I can say from Ryan's presentation is re MIME format:

It is simply meant in GEDCOM X for embedding objects (like images, sound files, videos) the same way MIME is used for embedding HTML in email.
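For readers unfamiliar with MIME bundling, the sketch below packs a piece of structured data and a binary object into one multipart container using Python's standard email package, then unpacks it again. It only illustrates the general mechanism; the XML snippet, the file name and the function names are made up, and nothing here reflects the actual GEDCOM X file format.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def bundle(record_xml, image_bytes):
    """Pack data and one binary object into a single MIME multipart container."""
    msg = EmailMessage()
    msg.set_content(record_xml, subtype="xml")  # becomes a text/xml part
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg",
                       filename="portrait.jpg")  # hypothetical file name
    return msg.as_bytes()


def unbundle(raw):
    """Return [(content_type, content), ...] for every leaf part."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [(part.get_content_type(), part.get_content())
            for part in msg.walk() if not part.is_multipart()]


raw = bundle('<person id="p1"/>', b"\xff\xd8\xff\xe0 fake image bytes")
for content_type, content in unbundle(raw):
    print(content_type, len(content))
```

Binary parts are base64-encoded inside the container, so the bundle grows noticeably larger than the raw objects, which is one practical cost of embedding large media files.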

Louis
louiskessler 2012-02-15T20:36:53-08:00
Oh, I reread your post above and I see you knew that.

Yes. They do want to embed the objects in the format. It was even mentioned you might have files gigabytes in size. But they'd sooner have the data with it.

We'll see how that goes.

My idea is to have two files. One with the data and one with the objects.

Louis
WesleyJohnston 2012-02-15T21:14:52-08:00
This goes back to testuser42's links for the place discussion. The GOV seems a good step, but I do not see how it is being managed ... and I saw a problem with the very first location that I tried. I entered "slabce", and it properly shows that it is in Central Bohemia. But it completely omits the extremely important fact that it is in okres (similar to US county) Rakovnik. Ultimately all of these standards will deal with all these sorts of issues, but they are very uneven right now. I really think FHISO should ultimately be the source for all standards, including places.
ttwetmore 2012-02-15T23:16:36-08:00
"Tom has a Github account and is taking part in the discussions there. Maybe he (or Ryan, if he finds the time) can put our questions up there? Or should we get accounts? I don't really want to, since I'm in no way a developer and can't follow the finer points. Maybe they need their own forum for User questions?"

GEDCOMX should be open for discussion by anyone. If you want to ask questions or add to the discussion there you should get involved. Though flattered by your suggestion that I could relay BetterGEDCOM questions to GEDCOMX, I don't think that is appropriate.

I have made a number of comments about GEDCOMX in their issue lists. There is more technical activity going on there than at BetterGEDCOM right now, so anyone interested in what's going on with genealogical data models should also be tracking activity there as well as on the BetterGEDCOM wiki.

Though there has been resistance to the persona concept at BetterGEDCOM, it is a key concept in the GEDCOMX Record Model. This in my mind places the usability of the GEDCOMX model ahead of any BetterGEDCOM model that would not include it. However, there is no BetterGEDCOM model, and in my opinion, little prospect to be one for years. So arguing about the pros and cons of the persona concept in the BetterGEDCOM context, seeing that it could be years to resolve, is futile. Regarding GEDCOMX I cannot say that I am happy with a model that will essentially be dictated by FamilySearch, but I can say that it is satisfying to be working with a project with some prospect of completion and ultimate use.

I am against the idea of separating the GEDCOMX model into a Record Model and Conclusion Model. That was one of my very first critiques of GEDCOMX when I was invited to comment a number of months ago. I am pleased to see a growing ground swell of agreement to the views I expressed. By getting rid of the distinction one moves directly into the possibility of using N-tier person groups for handling evidence and conclusions in a cohesive manner, and I hope GEDCOMX continues in that direction.
AdrianB38 2012-02-16T02:42:01-08:00
"Is it really so often that GEDCOM has things attached to FAM?"

Klemens - two answers to that. Firstly, I make extensive use of a custom attribute for Family Residence. It's not ideal since it doesn't define which of the children are resident (I have to write that in the note), but it's the least worst solution I have found so far for avoiding duplication.) OK, so that's at least one person uses it! (grin).

Secondly, if it is possible that stuff is defined against a family in GEDCOM, then for conversion's sake it must be possible to convert it forward. No matter how rare its use. Now, since Relationships can have facts in GEDCOMX, it's perfectly possible to map Family stuff to the relationship between the two spouses, but my issue with that, is that it's put a different spin on matters because an attribute of a family is a subtly different thing from an attribute of a relationship between two spouses. Starting with the expectation of how many people are involved.

Ideally what's required is a multi-person fact (e.g. multi-person event).

NB - odd they accept the idea of embedding objects - it was removed from the draft for GEDCOM 5.5.1! Just adds to the impression of a cleaner break than I expected.

Adrian
ttwetmore 2012-02-16T08:06:01-08:00
I don't get the "GenealogicalResource" and "GenealogicalEntity" boxes, too.

I believe that they are super classes only, a means of abstracting out any commonalities shared by their sub-classes.
heatonra 2012-02-16T08:21:21-08:00
Hey guys.

I'll try to take on your questions by updating the docs and adding comments here. I'm a bit buried at the moment, so you'll have to be patient.

If someone would be willing to summarize the questions in a succinct bullet list, that might help me get to them faster.
gthorud 2012-02-16T16:33:17-08:00
About "GenealogicalResource" and "GenealogicalEntity". My assumption is that they are superclasses with attributes that are inherited by the subclasses, so e.g. Person will inherit the attributes of those 2 classes. But I may be wrong.

If I am right, I think it would have been better to have a diagram that would spell out all the attributes for an entity - it can't be that difficult to have groups showing the inherited attributes inside an entity - and just explain that some of these groups are inherited. It would also simplify the diagrams substantially.
AdrianB38 2012-02-17T05:11:41-08:00
Ryan - thanks for watching. In response to your request, here's my summary of SOME points:
1. What is the effect of separating the Records model from a Conclusion model? Are they separate areas within one overall? Or wholly separate? Does the answer change depending on whether we talk about models or implementation?

2. Please explain what happened to the entity "source" from GEDCOM. Where do things like transcriptions of sources go?

3. Does the Persona allow a missing age? (It should)

4. Can the model have a Relationship with no Facts? (Sometimes the only info will be that there is a Relationship of a certain type)

5. Conclusion model - this must allow more than one value for gender, with dates.

6. Please explain what the "Conclusion" does in the Conclusion model.

7. Can relationships in GEDCOMX accommodate 1 to many and many to many?

8. The current documentation seems to require a knowledge of genealogy, specific varieties of UML and XML/JSON. There is a RISK that the number of people with these abilities is so low as to invalidate FS's desire to involve others. Possible MITIGATION - create a Requirements Catalogue in English plus whatever other languages are possible that will allow comment by a wider community.

8.1. Also - a Requirements Catalogue would make it clear what specifics the model is intended to solve.

9. Conclusion model - this needs to allow dates against Names (both primary and alternate). There is probably no need for a Place against Names but there is a need for some sort of description - e.g. Primary name = Archibald Alexander Leach, type = registered-name; Alternate name = Cary Grant, type = stage-name, Date = from ????

9.1 Is there any definition of Primary versus Alternate names?

10. Please explain how values currently held in GEDCOM against the Family entity can be mapped to GEDCOMX. In particular, please explain how custom attributes and events such as Family-Residence can be mapped.

10.1 Suggestion - multi-person facts could be used for item 10.

11. Please justify embedding rather than linking of multimedia.

Note for BG Wiki members - this is NOT a full list of the issues raised here. Please check the list above against the posts here and correct errors and missing bits below. Any errors of misrepresentation are my fault. (PS - it would be sensible to let GEDCOMX answer these points first, rather than add still others....)

Specific question for Klemens
Re: "I'm really hoping they will allow "first-class" Events with roles and have a "first-class" Place Structure. " (testuser42) - not sure what "first-class" means here....
testuser42 2012-02-18T06:27:58-08:00
Re "first-class" Event and Place:
I was using Tom's term for a top-level Record that can stand on its own, one that in GEDCOM would have a zero as its level number. So I want to allow something like

0 PLACE
1 ...
Discussed here: https://github.com/FamilySearch/gedcomx/issues/79
Place Records should allow a hierarchy.
IMHO there would be no need to turn every mention of a place into a Place Record. It should still be possible to have the name of a place just as a tag with an event or fact, like GEDCOM usually does.


0 EVENT
1 ...
Partially discussed here: https://github.com/FamilySearch/gedcomx/issues/118
The Event Record should allow multiple Roles (eg links to Person(a)s that took part or are somehow connected)
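The two "first-class" records described above can be sketched as simple data structures: a Place that points to a parent place to form a hierarchy, and an Event that carries any number of roles while still allowing a plain place name. All class and field names are invented for illustration and are not taken from GEDCOM X or DeadEnds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Place:
    """A stand-alone place record; parent links form the hierarchy."""
    name: str
    parent: Optional["Place"] = None

    def full_name(self):
        parts, node = [], self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ", ".join(parts)


@dataclass
class Role:
    person_id: str  # link to a Person or Persona record
    role: str       # e.g. "bride", "witness"


@dataclass
class Event:
    """A stand-alone event record with any number of participants."""
    type: str
    place: Optional[Place] = None     # link to a Place record, or
    place_text: Optional[str] = None  # just the place name as plain text
    roles: List[Role] = field(default_factory=list)


bohemia = Place("Central Bohemia")
rakovnik = Place("Rakovnik", parent=bohemia)
slabce = Place("Slabce", parent=rakovnik)

wedding = Event("marriage", place=slabce,
                roles=[Role("p1", "bride"), Role("p2", "groom"), Role("p3", "witness")])
print(slabce.full_name(), len(wedding.roles))
```

Keeping place_text alongside the optional place link reflects the point that not every mention of a place needs to become a Place record.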
testuser42 2012-02-15T04:10:27-08:00
Now some opinions about the GEDCOM X models as of now:

- I'm not a fan of separating a Records model from a Conclusion model. I don't see the use for that. Of course, Records (=Source Data and Evidence) need to be kept separate from the conclusions, but you can do this with just one model.

- I'm really hoping they will allow "first-class" Events with roles and have a "first-class" Place Structure.

- I think the Record model could need a Source entity. I would want to keep a transcription of my physical sources there, as well as anything that concerns the physical source in its entirety. Right now, the Record model asks you to interpret your physical source and at once divide up the bits of evidence that you find.

- The record model diagram gives
Persona 1->1 Age
It should be 0->1 (you may not find an age in your source)

- Also there:
Relationship 1-> * Fact
I think it should be 0-> * because sometimes the only info will be that there is a Relationship of a certain type.


- Conclusion Model: A Person could have more than one gender (but hopefully not more than two) over time. Rare, but possible...

- What does the "Conclusion" do in the Conclusion model? It's not clear right now.

- Relationships: We had a lot of good discussion on that in this wiki... GEDCOM X right now only has a relationship for two Persons. That's fine most of the time, but maybe we might need 1->* relationships? Additionally, we need something like an Event with multiple Roles.
Also, I'm not sure a Relationship needs to be an Entity of its own. A simple link from inside a Person Record to another Person Record seems to be OK.


- About the Common Model:
This seems where (international) Users could contribute: What NameTypes do we need? e.g. I'm missing a nameType "Rufname". What relationshipTypes or placeTypes are missing?

- One concern: The text-to-information ratio is very bad in all the code, because of all the namespace etc. fluff. It makes it hard to see what's really happening.


- About the Place discussion at https://github.com/FamilySearch/gedcomx/issues/79
I'd like to point out
http://gov.genealogy.net/search
http://wiki-de.genealogy.net/GOV
since this is a place database that addresses many of the issues that are discussed.
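The two cardinality points in the list above (Persona 0->1 Age, Relationship 0->* Fact) translate into code as an optional field and a list that may be empty. The sketch below only illustrates those suggested cardinalities; the class and field names are assumptions and do not reproduce the GEDCOM X Record Model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    type: str
    value: Optional[str] = None


@dataclass
class Persona:
    name: str
    age: Optional[int] = None  # 0..1: the source may not state an age


@dataclass
class Relationship:
    type: str
    person1: Persona
    person2: Persona
    facts: List[Fact] = field(default_factory=list)  # 0..*: may be empty


john = Persona("John Smith")            # no age given in the source
mary = Persona("Mary Smith", age=34)
couple = Relationship("couple", john, mary)  # only the relationship type is known
print(john.age, couple.facts)
```

With a 1->1 or 1->* rule, neither john nor couple could be recorded without inventing an age or a fact that the source does not contain.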
AdrianB38 2012-02-15T09:18:00-08:00
"I'm not a fan of separating a Records model from a Conclusion model"
I read that as being more like 2 sub-models within one overall model. But that may simply be me expecting to see that sort of construction, which allows people to concentrate on one area and not be distracted by a data model of the full thing, which is invariably illegible if it's a real-life "thing", simply because real life is complex.

"I think the Record model could need a Source entity." I think that's the Record entity in the Record Model. But where the "normal" stuff like transcript goes is anyone's guess.

Re optionality of the relationships - I've truly no idea what the diagram says because I have no real understanding of this type of UML diagramming. According to Wikipedia, those links indicate aggregation relationships, not association relationships. No, I've no idea either.

Basically, to pick up on the comment about the "text-to-information ratio", something is going to have to give if GEDCOMX want comments from outside. BetterGEDCOM has enough issues when Data Modellers try to talk to End Users. If people need to be not just Data Modellers but UML modellers to understand GEDCOMX, then their audience just shrunk.

Like you, I was slightly surprised by the (apparent) absence of multiple sex values (apologies if anyone is disconcerted by that but it happens.)

Similarly but differently, I notice you can have multiple names but there appears to be no date against the values. Huh? Names like "Jackie Kennedy Onassis" have a dated relevance so should have a date (range). Again, my UML on this diagram type is non-existent, so I could be wrong in assuming there's no date.
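A dated name could look something like the sketch below, where each name carries a type and an optional year range and a helper picks the names in use in a given year. The person, the years, and all class and field names are hypothetical, chosen only to illustrate the idea; they are not part of the GEDCOM X Conclusion Model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Name:
    text: str
    type: str                        # e.g. "birth-name", "married-name"
    valid_from: Optional[int] = None  # first year of use, if known
    valid_to: Optional[int] = None    # last year of use, if known


def names_in_year(names: List[Name], year: int) -> List[str]:
    """Names whose (possibly open-ended) date range includes the given year."""
    return [n.text for n in names
            if (n.valid_from is None or n.valid_from <= year)
            and (n.valid_to is None or year <= n.valid_to)]


# Hypothetical person with made-up years, for illustration only.
names = [
    Name("Mary Smith", "birth-name", valid_to=1899),
    Name("Mary Jones", "married-name", valid_from=1900, valid_to=1919),
    Name("Mary Jones Brown", "married-name", valid_from=1920),
]
print(names_in_year(names, 1910))
```

Leaving either end of the range empty covers the common case where only the start or the end of a name's use is known.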

I wait with interest to see what will happen to the concept of Family. My vague impression is that it is being replaced by Relationships (such as father-to-child, mother-to-child, spouse-spouse, etc.) Now, that has some justification but immediately creates problems for the loading of the millions of GEDCOMs that have attributes and events against the _family_. Sure these attributes and events can be loaded against the spouse-spouse relationship, but by making it explicitly spouse-spouse (rather than the nebulous family) it implies the children are not involved in (say) a spouse-spouse-residence, whereas a family-residence does imply they are involved.

Caveat again - I'm badly adrift at finding my way round this stuff, so anything could be wrong!
testuser42 2012-02-15T10:28:09-08:00
I found this via Tamura's Google+:
RootsTech Learning #3 - GEDCOM X and/or BetterGEDCOM and/or FHISO
http://geneapoppop.blogspot.com/2012/02/rootstech-learning-3-gedcom-x-andor.html

Good thoughts there.