There are different ways to look at genealogical data, some practical and some theoretical. Depending on what the goals are, one or more of these approaches are useful to study or be familiar with.
Please be particularly careful when using information from other sources in this section, as the information applicable to these different data models may be under copyright or other usage restrictions.
The original GEDCOM Data Model actually has several versions; please see the GEDCOM Data Model Page for details. Note also that discussion of the GEDCOM Data Model here largely concerns the part of the overall GEDCOM standard that deals with genealogical data structures and formats. For example, in the GEDCOM 5.5 Standard, the applicable section is Chapter 2, which describes the Lineage-Linked GEDCOM Form. Chapter 1 of the GEDCOM 5.5 Standard covers the low-level data syntax, which will be superseded by the use of XML.
GEDCOM sought to be a practical implementation of a portable genealogical data model. It is no longer under development; its owner/developer, FamilySearch, considers it to be at the end of its development.
CommSoft's proposal from 1994 included some excellent modifications to lineage-linked GEDCOM, including both Events and Places as record level objects.
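To make the idea of events and places as record-level objects more concrete, here is a minimal sketch in Python. It is not CommSoft's actual schema: the class names, field names, and sample data are all hypothetical, chosen only to show events and places living as their own records that people point to, rather than being nested inside an individual or family record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Place:
    """A place stored once as its own record (hypothetical fields)."""
    id: str
    name: str


@dataclass
class Person:
    """A person record; events are NOT nested inside it."""
    id: str
    name: str


@dataclass
class Event:
    """An event stored as its own record, linking people and a place by id."""
    id: str
    type: str
    date: Optional[str] = None
    place_id: Optional[str] = None
    # Each participant is a (person_id, role) pair.
    participants: List[Tuple[str, str]] = field(default_factory=list)


# Made-up sample data for illustration only.
places = {"P1": Place("P1", "Boston, Massachusetts")}
people = {
    "I1": Person("I1", "John Smith"),
    "I2": Person("I2", "Mary Jones"),
}
events = {
    "E1": Event("E1", "marriage", "1850", "P1",
                [("I1", "groom"), ("I2", "bride")]),
}


def events_for(person_id: str) -> List[Event]:
    """Return every event in which the given person takes part."""
    return [
        event for event in events.values()
        if any(pid == person_id for pid, _role in event.participants)
    ]
```

Because the marriage is a single record, both participants share it, and the place name is stored once and referenced by id rather than repeated as free text on each person.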
The GenTech data model, sponsored in part by the National Genealogical Society (of America), is excellent in that it seeks to define a model that can include ALL genealogical data. However, it is so inclusive and open that it is probably unusable as a practical model (which its developers themselves recognize). It nevertheless remains a useful tool for considering genealogical data, and I encourage those interested in this topic to study it.
GRAMPS is an open-source software product available free for many operating systems. The GRAMPS model is noteworthy because it actually stores genealogical data in XML, a widely used hierarchical markup language that was proposed as a successor to the proprietary format used in the GEDCOM standard.
The DeadEnds Data model was created by Tom Wetmore to be the underlying model used in his DeadEnds system of genealogical programs.
GenXML was created by Christoffer Owe as an XML-based alternative to Gedcom. It is inspired by the Gentech GDM. There's a PDF which gives an in-depth description of the format.
WeRelate uses a slightly modified version of GedML (with additional modifications to correct some ANSEL character mappings) to convert the GEDCOM to XML. It maps the raw XML produced by GedML into a restricted schema that is simpler to process. We have found that nearly all incoming GEDCOM data can be mapped to this simplified schema. (Data from incoming GEDCOM files that does not fit into the model is added as notes or text.) The resulting XML is then added as XML "data islands" to the various wiki pages. Part of the motivation for the model is to make it easy for end users to understand differences. When a user asks for a "diff" between two versions of a page, they see a Wikipedia-like diff screen directly on the XML (and so far nobody has complained). If I had to do it over again I would change the model somewhat, but it's working well overall.
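The idea of diffing two versions of a page directly on the XML can be sketched with Python's standard `difflib` module. This is only an illustration: the element and attribute names below are made up and are not WeRelate's actual schema, and WeRelate's own diff screen is produced by its wiki software, not by this code.

```python
import difflib

# Two hypothetical versions of an XML "data island" (names are invented).
old_xml = """<person>
  <name given="John" surname="Smith"/>
  <event_fact type="Birth" date="1 Jan 1900"/>
</person>"""

new_xml = """<person>
  <name given="John" surname="Smith"/>
  <event_fact type="Birth" date="2 Jan 1900" place="Boston"/>
</person>"""


def xml_diff(old: str, new: str) -> list:
    """Return a line-based unified diff between two XML strings."""
    return list(difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="previous version",
        tofile="current version",
        lineterm="",
    ))


if __name__ == "__main__":
    for line in xml_diff(old_xml, new_xml):
        print(line)
```

A simple schema with one fact per line keeps such a line-based diff readable, which fits the stated goal of making differences easy for end users to understand.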
GedML is a set of strict GEDCOM-to-XML translation utilities that have no defining schema, since they apparently rely on the underlying GEDCOM being compliant. The GedML data model is actually the GEDCOM data model, but its inclusion here is useful for understanding the basic differences between GEDCOM and XML in terms of syntax.
Original distribution: GedML.zip, and the modified version referred to in the WeRelate Data Model: gedml-dist.zip.
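The syntactic difference between GEDCOM and XML can be illustrated with a small sketch that turns level-numbered GEDCOM lines into nested XML elements. This is not GedML's code, and its output is not guaranteed to match GedML's; the function name `gedcom_to_xml`, the `GED` root element, and the `ID`/`REF` attribute names are assumptions made for this example. It also ignores details a real converter must handle, such as CONT/CONC continuation lines and character set conversion.

```python
import xml.etree.ElementTree as ET

# A tiny made-up GEDCOM fragment for illustration.
SAMPLE = """\
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1900
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I1@
"""


def gedcom_to_xml(text: str) -> ET.Element:
    """Convert GEDCOM lines to an XML tree, using tags as element names."""
    root = ET.Element("GED")
    stack = [root]  # stack[n] is the parent element for a line at level n
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(" ", 2)
        level = int(parts[0])
        xref = None
        if parts[1].startswith("@"):
            # Record line: LEVEL @XREF@ TAG [VALUE]
            xref = parts[1].strip("@")
            rest = parts[2].split(" ", 1)
            tag = rest[0]
            value = rest[1] if len(rest) > 1 else ""
        else:
            # Ordinary line: LEVEL TAG [VALUE]
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        # Discard elements deeper than this line's parent level.
        del stack[level + 1:]
        elem = ET.SubElement(stack[level], tag)
        if xref:
            elem.set("ID", xref)
        if len(value) > 2 and value.startswith("@") and value.endswith("@"):
            # A pointer value becomes a reference attribute.
            elem.set("REF", value.strip("@"))
        elif value:
            elem.text = value
        stack.append(elem)
    return root


if __name__ == "__main__":
    print(ET.tostring(gedcom_to_xml(SAMPLE), encoding="unicode"))
```

The hierarchy that GEDCOM expresses through level numbers is expressed in XML through element nesting, while the data model (the tags and their meaning) stays the same.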
GenBridge is a commercial product owned by WhollyGenes, the makers of The Master Genealogist. WhollyGenes licenses GenBridge as a technology capable of understanding several genealogical data formats. However, its developers note that it is an import technology and is not designed for exporting data. Because it is a commercial product, its data model is not available to dissect, but it is an important product to note.
In the discussion "Event-oriented genealogy software for Linux" on the Usenet group "soc.genealogy.computing", Nick Matthews posted about his project. See Message-ID: <email@example.com> or GoogleGroups.
He says it "is at a very early stage, there's no functional program yet but there is code and the beginnings of a practical (I hope) database design at thefamilypack.org".
From the homepage: "The Family Pack is an ambitious open source project to create a new cross platform genealogy program. The project will involve designing a new genealogical database, creating a program to make use of it, and finally, organising ways of providing some standard universal data sets."
The database follows the GenTech Model, with a Reference Entity in place of GenTech's Assertion Object. This could be a good test of whether that model is indeed too complicated for real-life software, or whether it can be used after all.
See Tamura Jones' excellent article giving an overview of GEDCOM alternatives. It includes the history of many of the data models mentioned above as well as a number of others. There are also many links to the various models at the bottom of the article. (Note that versions of Internet Explorer earlier than Version 9 cannot view Tamura's pages.)
Other Software Product Data Models
Please add any others you know of or are familiar with that would be useful to examine or consider, particularly if you represent any of the efforts to devise a GEDCOM-like standard based on XML.