Outline Of The GRAMPS Data Model
This describes aspects of the Gramps data model for genealogical information. It was taken from the Relax NG file of version 1.4.0 of the Gramps specification.
Certain parts of the Gramps model are not covered here if they don’t apply directly to the different Gramps records, their contents and their interactions. The Gramps Relax NG file can be inspected for the full specifications.
All types are either drawn from a preset collection or are custom, user-defined types.
Record Types
Gramps uses eight record types: Events, Persons, Families, Sources, Places, Objects, Repositories, and Notes.
Event Records
Event records represent happenings in people’s lives. Here is how Gramps structures Event records:
- ID (optional; required) – Every Gramps record has two identifiers: a short, possibly non-unique, optional string that users can see and modify; and a long, required, permanent, unique string that the user cannot change and that other records use to refer to this record. The unique identifier (called a "handle" in Gramps vocabulary) is currently not as long as a typical UUID, but plays the same role that a UUID would. However, turning the handle into a proper UUID is under consideration.
- Event Type (required) – One of a preset collection of types, or a user-defined, custom type.
- Date (optional) – Date when the event occurred. Gramps has a specific format for writing dates, and can convert many calendars into this format. Gramps also has a method for representing years with alternate New Year days, and other date variations.
- Place (optional) – Place where the event occurred. Note that Gramps has stand-alone Place records, but this field seems to be a way of placing a full Place directly inside the Event record, rather than having the Event record refer to a Place record. TODO: Ask why this is so. => Maybe because there is no internal place database: a place is a new object that can be shared. So we share an existing place or, if needed, edit one (there is a button for adding a new place on the place list).
- Description (optional) – Text that describes the event.
- Attributes (any number) – Key and value pairs the user can add to describe any pertinent attribute of the event. User defined keys allow any number of custom attributes.
- Note references (any number) – References to Gramps Note records that provide information about the Event. Unlimited attached records.
- Source references (any number) – References to Gramps Source records that provide information about the source of the Event. Only one reference but unlimited attached records.
- Object references (any number) – References to external files that contain information, e.g., images, about this Event. Only one reference but unlimited attached records.
Event records do not contain references to other Event records or to Person or Family records, so an Event record cannot reference the Persons who play roles in the Event. Instead, Person records and Family records hold Event references that point to Event records and also specify the roles they play in the Event.
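The Event structure above can be sketched as a small Python data class. This is only an illustration under stated assumptions: the class and field names (`Event`, `handle`, `gramps_id`, and so on) are hypothetical and are not taken from the Gramps source code, and the sample values are made up. The place is kept as a plain optional string because the outline is unsure whether it is an inline value or a reference.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """Hypothetical sketch of an Event record as outlined above."""
    handle: str                        # required, permanent, unique; other records refer to it
    event_type: str                    # required; preset or custom type
    gramps_id: Optional[str] = None    # optional, user-visible, user-editable
    date: Optional[str] = None
    place: Optional[str] = None        # inline value or reference; the outline is unsure
    description: Optional[str] = None
    attributes: dict[str, str] = field(default_factory=dict)   # key/value pairs
    note_refs: list[str] = field(default_factory=list)         # handles of Note records
    source_refs: list[str] = field(default_factory=list)       # handles of Source records
    object_refs: list[str] = field(default_factory=list)       # handles of Object records


# Made-up sample data. Note that there is no field pointing to a Person or Family.
birth = Event(handle="_e0001", event_type="Birth", gramps_id="E0001", date="1850-03-12")
birth.attributes["Witness count"] = "2"
```

Only the handle and the event type are needed to build a record in this sketch; everything else defaults to empty.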
Person Records
Gramps Person records represent persons. Gramps does not explicitly distinguish between Person records taken directly from evidence versus Person records built up through research to hold a combination of information from many sources. Gramps Person records can exist anywhere along the evidence/conclusion continuum, and can move back and forth in this continuum based on user actions. The format of the Gramps Person record is:
- ID – see Event record.
- Gender (required) – M, F or U.
- Name (required) – Person’s name. Gramps has a specific sub-structure format for writing names.
- Event references (any number) – These are references to Event records that this Person plays a role in. An event reference has its own internal structure:
- Id (required) – the index value of the Event record referred to.
- Priv (optional) – a value that is either 0 or 1 (1 means the record is private, 0 public)
- Role (optional) – the role this Person has in the Event. All types are either one of a predefined set, or a custom, user-defined type.
- Attributes (any number) – see Event record.
- Note references – See Event record.
- LDS Ordinances (optional) – See the Relax NG specification for more information.
- Object references – See Event record.
- Addresses (any number) – Address values have their own internal structure. They are used to specify different places the Person has lived and the time they lived there. Other models treat this concept differently, dealing with it as another kind of event usually called Residence.
- Attributes – see Event record.
- URLs (any number) – URL values are simple structures that refer to URLs that have information pertaining to the person.
- Child-of references (any number) – References to Family records this Person is a child of.
- Parent-in references (any number) – References to Family records this Person is a parent in.
- Person references (any number) – References to other Person records. Person references have a relationship tag to specify the relationship between this Person record and the referenced Person record.
- Note references – See Event record.
- Source references – See Event record.
- Tag references (any number) – Tags are user-defined strings that can be assigned to records for any purpose. They do not appear in reports or GEDCOM exports.
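Because Event records hold no references back to Persons, finding the participants of an event means scanning the Person records for matching event references. The sketch below illustrates that lookup. The names `EventRef`, `Person` and `participants` are hypothetical, the default role is an assumption (the outline only says the role is optional), and the people are made up.

```python
from dataclasses import dataclass, field


@dataclass
class EventRef:
    """Hypothetical event reference held by a Person (or Family)."""
    event_handle: str        # handle of the Event record referred to
    role: str = "Primary"    # preset or custom role; the default is an assumption
    private: bool = False    # corresponds to priv = 1 (private) or 0 (public)


@dataclass
class Person:
    handle: str
    name: str
    gender: str = "U"        # "M", "F" or "U"
    event_refs: list[EventRef] = field(default_factory=list)


def participants(event_handle: str, people: list[Person]) -> list[tuple[str, str]]:
    """Return (person name, role) pairs for everyone who refers to the event."""
    return [
        (p.name, ref.role)
        for p in people
        for ref in p.event_refs
        if ref.event_handle == event_handle
    ]


anna = Person("_p1", "Anna Berg", "F", [EventRef("_e1", "Primary")])
karl = Person("_p2", "Karl Berg", "M", [EventRef("_e1", "Witness")])
found = participants("_e1", [anna, karl])
```

Here `found` lists Anna as the primary participant and Karl as a witness, even though the event itself knows nothing about either of them.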
Family Records
Gramps has Family records. There has been a controversy for many years about whether Family records are necessary or not, but most models decide they are. The Gramps Family record has the format:
- ID – See Event record.
- Rel (optional) – This is a tag that indicates the type of relation between the parent/spouse people.
- Father (optional) – This is a reference to a Person record that has the father role for the Family.
- Mother (optional) – This is a reference to a Person record that has the mother role for the Family.
- Event References – See Person record. These are events of the family (such as marriage).
- LDS Ordinances – See Person record.
- Object References – See Person record.
- Child References (any number) – These are the references to the Persons who are children in the family.
- Attributes – See Event record.
- Note References – See Event record.
- Source References – See Event record.
- Tag References – See Person record.
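As a rough illustration of the Family format, the sketch below models the optional father and mother references and the list of child references. The names (`Family`, `rel_type`, `members`) are hypothetical and do not come from the Gramps code; the handles are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Family:
    """Hypothetical sketch of a Family record as outlined above."""
    handle: str
    rel_type: Optional[str] = None     # relation between the parents; preset or custom
    father: Optional[str] = None       # handle of a Person record
    mother: Optional[str] = None       # handle of a Person record
    child_refs: list[str] = field(default_factory=list)   # handles of Person records


def members(family: Family) -> list[str]:
    """Return the handles of all persons in the family, parents first."""
    parents = [h for h in (family.father, family.mother) if h is not None]
    return parents + family.child_refs


# A family with a known father, no recorded mother, and two children.
fam = Family("_f1", rel_type="Married", father="_p2", child_refs=["_p3", "_p4"])
```

Since both parent references are optional, a family with only one known parent, or with children only, is still a valid record in this sketch.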
Source Records
Gramps Source records are used to describe sources of genealogical information. Source records have the following format:
- ID – See Event record.
- Title (optional) – Title of the source.
- Author (optional) – Author of the source.
- Publishing Info (optional) – Publication information about the source.
- Abbreviation (optional) – Abbreviation for the source.
- Note References – See Event record.
- Object References – See Event record.
- Data Items (any number) – Data items are key/value text pairs, allowing any number of attributes to be given to the Source record.
- Repository references (any number) – A Repository reference points to a Repository and can contain additional information, including a call number code and medium code for the source.
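A Repository reference carries its own data in addition to the pointer, which can be sketched as follows. The names `RepoRef` and `Source` and their fields are hypothetical, and the census title, call number and other values are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepoRef:
    """Hypothetical repository reference held by a Source."""
    repo_handle: str                    # handle of the Repository record
    call_number: Optional[str] = None   # where to find the source in that repository
    medium: Optional[str] = None        # form in which the repository holds it


@dataclass
class Source:
    handle: str
    title: Optional[str] = None
    author: Optional[str] = None
    pub_info: Optional[str] = None
    abbreviation: Optional[str] = None
    data_items: dict[str, str] = field(default_factory=dict)   # key/value text pairs
    repo_refs: list[RepoRef] = field(default_factory=list)


# One source held by two repositories in different media (made-up values).
census = Source("_s1", title="1880 Census", data_items={"Roll": "T9-0123"})
census.repo_refs.append(RepoRef("_r1", call_number="MF 929.3", medium="Microfilm"))
census.repo_refs.append(RepoRef("_r2", medium="Electronic"))
media = [ref.medium for ref in census.repo_refs]
```

Keeping the call number and medium on the reference rather than on the Source lets the same source be described once and still be located differently in each repository.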
Place Records
Gramps employs Place records to represent places where genealogical events occurred. Having Places be records allows a single Place record to be referred to by many other records, saving space and simplifying updates. The migration of places from being attributes of events (as in Gedcom) to being stand-alone records (as here in Gramps) has happened in many genealogical models. The format of a Gramps Place record is:
- ID – See Event record.
- Title (optional) – A title/short description of the place.
- Coordinates (optional) – The latitude and longitude of the place.
- Locations (any number) – A location is a value with a substructure that can include values for any of the following location components:
- Street
- City
- Parish
- County
- State
- Country
- Postal Code
- Phone Number
- Object References – See Event record.
- URLs – See Person record.
- Note References – See Event record.
- Source References – See Event record.
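The benefit of stand-alone Place records, namely that one update is seen by every record sharing the place, can be shown with a small sketch. It assumes that other records refer to a place by its handle; the class names, field names and sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Location:
    """Hypothetical location substructure; every component is optional."""
    street: Optional[str] = None
    city: Optional[str] = None
    parish: Optional[str] = None
    county: Optional[str] = None
    state: Optional[str] = None
    country: Optional[str] = None
    postal_code: Optional[str] = None
    phone: Optional[str] = None


@dataclass
class Place:
    handle: str
    title: Optional[str] = None
    lat: Optional[str] = None
    long: Optional[str] = None
    locations: list[Location] = field(default_factory=list)


# Places are stored once, keyed by handle.
places = {"_pl1": Place("_pl1", title="Oslo",
                        locations=[Location(city="Oslo", country="Norway")])}

# Two events refer to the same place by handle (assumed reference style).
event_places = {"Birth": "_pl1", "Marriage": "_pl1"}

# A single edit to the Place record is visible through both events.
places["_pl1"].title = "Christiania (now Oslo)"
titles = {name: places[h].title for name, h in event_places.items()}
```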
Object Records
Gramps refers to information outside the model through Object records. Each Object record refers to a file on the local computer. The format of an Object record is:
- ID – See Event record.
- File (required) – Name of the file, presumably in a full path representation, so it can be found on the local machine.
- Type (required) – The MIME type of the file.
- Description (required) – A description of the file.
- Note References – See Event record.
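An Object record pairs a file path with its MIME type. The sketch below fills in the type with Python's standard `mimetypes` module. Whether Gramps derives the type this way is not stated in the outline, so treat that as an assumption; `MediaObject`, `make_object` and the file path are hypothetical.

```python
import mimetypes
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """Hypothetical sketch of an Object record as outlined above."""
    handle: str
    file: str            # path to the file on the local machine
    mime: str            # MIME type of the file
    description: str     # listed as required in the outline
    note_refs: list[str] = field(default_factory=list)


def make_object(handle: str, path: str, description: str) -> MediaObject:
    # guess_type returns (type, encoding); fall back to a generic type if unknown.
    mime, _ = mimetypes.guess_type(path)
    return MediaObject(handle, path, mime or "application/octet-stream", description)


photo = make_object("_o1", "/home/user/photos/anna_1900.jpg", "Portrait of Anna")
```

The guess is based on the file extension only, so the file does not have to exist for this sketch to run.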
Repository Records
Repository records represent repositories that hold sources. A Gramps Repository record has the format:
- ID – See Event record.
- Name (optional) – Name of the repository.
- Type (optional) – The type of the repository.
- Addresses – See Person record.
- URLs – See Person record.
- Note References – See Event record.
Note Records
The Gramps model has notes represented in their own Note records. This allows any number of other records to refer to the same Note record. The format of a Gramps Note record is:
- ID – See Event record.
- Format (optional) – A code with value of 0 or 1 representing whether the note is preformatted.
- Type (required) – An indication of the type of this note.
- Styled Text (required) – The text of the note itself. Note that Gramps has the notion of styled text that allows a basic level of text formatting including font, font size, color and so on.
- Tag References – See Person record.
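One way to picture styled text is plain text plus a list of style ranges over it. The sketch below uses that representation as an assumption; the Relax NG file should be consulted for how Gramps actually encodes styles. The names `StyleRange`, `Note` and `styles_at` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StyleRange:
    """Hypothetical style applied to the characters in [start, end)."""
    name: str                     # e.g. "bold" or "fontcolor"
    start: int
    end: int
    value: Optional[str] = None   # e.g. a color code for "fontcolor"


@dataclass
class Note:
    handle: str
    note_type: str                # required type of the note
    text: str                     # the plain text of the note
    preformatted: bool = False    # corresponds to format = 1 or 0
    styles: list[StyleRange] = field(default_factory=list)


def styles_at(note: Note, pos: int) -> list[str]:
    """Return the names of all styles covering the character at index pos."""
    return [s.name for s in note.styles if s.start <= pos < s.end]


note = Note(
    "_n1",
    "Research",
    "Baptised 1850 in Oslo.",
    styles=[StyleRange("bold", 0, 8), StyleRange("fontcolor", 9, 13, "#ff0000")],
)
```

In this sketch the word "Baptised" is bold and the year is colored, while the rest of the text carries no style.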
Comments
What is the purpose of this outline? There is an informal goal as part of the better Gedcom effort to investigate and analyze existing data models for genealogy. The analysis is intended both to educate better Gedcom participants about the nature of genealogical data models, and to analyze the models to see how well they support the genealogical process models that the better Gedcom team hope their formats will enable. Gramps claims that they have a very general purpose model that is used successfully by many users in many different ways and that it supports the processes that they believe their users use. Frankly, the Gramps attitude (that I have detected -- I'm not speaking for the better Gedcom effort at all, so just get mad at me if you get mad at anyone) has been rather condescending toward the better Gedcom effort. Kind of a "try me last" kind of attitude, as if the better Gedcom effort, after stumbling around in the wilderness for a while, will realize that Gramps had the answer all along. Let me be nasty and say that although I think this is an excellent model, I believe it needs a modicum of tweaking before it will fully support the Evidence and Conclusion Process. I will soon be describing this process so Gramps folk can tear it apart if they believe it deserves it. So let's get on with it and see if Gramps really is the answer.
For better Gedcom to do the full evaluation, better Gedcom will need to have some definition of the process they want to support. They don't have this yet. This is similar to a current buzzword in the software industry -- use case analysis. Better Gedcom will have to come up with something in this domain soon, or arguing will go nowhere since no one will know when anything is right. My proposed Evidence and Conclusion Process will be my first attempt to get some use-case-like specifications on the table.
Tom Wetmore
It seems I do not fully understand all sentences. But you know the free software philosophy. You are free to study the model; in addition, there is a well [http://www.gramps-project.org/wiki/index.php?title=Gramps_3.2_Wiki_Manual_-_Entering_and_editing_data:_detailed|documented manual], which also gives an overview of the model. I will try to clarify your work (where you were not certain) on the dedicated page.
Relax NG 1.4.0 is the draft model, which should be used for the next major release.
Priority is the user choice.
Gramps does not force anyone to use one form or method of data entry. That's why there is a name format selector, no internal place database, and no source model (Gedcom fields).
For dates, it seems that Gramps stores them in ISO format (YYYY-MM-DD). There are also some localized date handlers. The same idea applies to relationships.
Often Gramps offers generic methods (a less specific form) and lets users apply the model to their own needs.
Gramps can parse and generate Gedcom quite well already.
If you study its Gedcom parser, please, before saying that it is not legal according to an x-year-old specification, try to ask the developers (or look at the code). Some people stopped developing Gramps because Gedcom issues were too tiring/boring. A Gedcom validator does not exist, and most of the devs would love to see a BetterGedcom.
I just speak for myself, and English is not my mother tongue. So do not worry if there are some misunderstandings about the Gramps project goal or the future of Gramps XML. We do our best to keep documentation up to date but do not have a lot of free time. Most developers will first improve the program; we cannot hold long debates over a specification. Adding some URLs and quick descriptions of the current Gramps data model is maybe the most useful thing for your project, but we cannot stay tuned/connected every day.
I think what you are doing is a wise approach: study each of the existing data models, see what they have, what is missing, and then make a recommendation about what BG should be. That is not the general approach that I get from the rest of BG. It seems to be: "let's discuss from scratch what should go into a file format without being tainted by other data models." That's the part that I personally don't have time or energy for.
(Also, do not read too much into one developer's comments... Gramps is a community, just like this one. I would not say that "BetterGEDCOM's attitude is that..." and the same is true for Gramps.)
Gramps model is an on-going compromise between what would be perfect, what we can implement right now, and what will work given that GEDCOM is always lurking in the background. That is, many developers have an idea of what would be ideal, what we have time to do, and that we can't be completely disconnected from the current de facto standard. Gramps data model is a living process of continual development, created by those who participate in that process. Nothing more, or less.
Greg has started a conversation on gramps-dev re BG:
http://gramps.1791082.n4.nabble.com/Hi-From-BetterGEDCOM-and-Methodology-Questions-td3039612.html#a3039612
But many threads over there are about related data model issues and XML.
I see BG going in one of two ways:
1) BG is discussed at length, perfected, decided, and then fades into oblivion as an interesting academic exercise because no vendor had buy-in and it wasn't adopted.
2) The people behind BG decide that the only way to make a difference is to make sure that their ideas are incorporated into some application. So they join or create open source, like-minded communities, and build all that they want or need themselves.
If all of the open source/API groups (WeRelate, Gramps, FamilySearch, etc) banded together, then I think something might result from this.
I'm not the data model expert at Gramps, but I'll try to fill in some gaps on your outline.
-Doug
This is not a direct proposal for betterGedcom, rather a "possibilities exist" which could be developed in parallel. I think YAML might fit Tom Wetmore's idea of a more readable format in comparison to XML, so someday writing a YAML-syntaxed version of a genealogy data model might be an interesting exercise -- someday. Again, I'm not saying drop Gedcom syntax for this, just consider parallel possibilities under the light of copyright differences. Regards.
Tom's approach is exactly what BetterGEDCOM would like to see happen. In fact, as I'm sure Tom will tell you, I personally asked him to do this exact evaluation and post it here in order to encourage others to do the same. This is a wiki, however, and one that's been open for less than four days. Also, this is supposed to be a discussion about the Gramps Data Model, not about predictions of what BetterGEDCOM turns into here on its fourth day. On a community wiki, no less, one to be shaped by its participants, not its organizers.
My big question about the GRAMPS data model is how it can handle evidence and theory development. Really, I must have time to study this to speak intelligently, so hopefully you guys will work out some of the more salient points here.
Anyway, GRAMPS Data Model...?
Thank you for the GRAMPS Data Model.
I only have, at this early stage, one question, about "Family".
One of the issues in a couple of software packages has to do with terms that we see on screens and in reports.
Since we are trying to get this project started and some of us run into the 'not so traditional' families, should we start changing some of the traditional terms or add additional terms now rather than down the road a piece?
I am talking about the two terms Mother and Father as two people that make up the family. I think we need to consider that a Family is made up of Relationships. There is a relationship between two people. That relationship may expand to a relationship between these two people and another person. (Parents adding a Child).
One of the attributes to these three people is the sex of these people, another the relationship between any one of the three individuals and the other two. Oh, and those relationships may change. (won't get into the change of the sex that might be changed over time)
I realize that you are presenting the Gramps Data Model, but in moving into a BetterGEDCOM Data Model I hope that these issues will be addressed in that Data Model.
Russ
Concerning family. Whether a data model should contain the family object or not has turned into a controversial point. Frankly I don't see all the fuss. Most family researchers come from cultures based on the father/mother/children family structure. So for the vast majority of researchers and the vast majority of the persons they research, this family structure is an important and organizing concept. I always question the sanity of persons who campaign to throw out well established and well understood concepts simply because they don't handle 100% of all situations that arise in the real world. I uncharitably considered such persons to be over-opinionated nuisances, sort of like myself.
Bob Velke's point that a family record type is not required in a genealogical database is an oft-quoted statement. But if you read the context you will find that he is not throwing out the family concept. He is merely stating that family records are not physically required in a database because families can be reconstructed on the fly by software if the database model supports the father/mother/child relationship, which if they don't, they aren't genealogical programs to begin with.
Personally I think it would be wrong (and downright disastrously stupid and the best way to get your genealogy program to reach the point of zero sales) to get rid of the family concept. I think the solution is better support for non-traditional family forms.
And sticking this comment right here in this GRAMPS discussion points out one of my pet peeves about the way Wikis distribute discussion topics all over the place. I know this family to be or not to be discussion is going on somewhere else in this Wiki, and that this response should be there, or there also, but I'm too lazy to go look through the discussions going on on all the pages to discover where it's happening.
Tom Wetmore, over-opinionated nuisance
Where did I throw out the Family Object or record? I am only trying to understand, at a very high level, what GRAMPS is and does.
Also, sharing some of the things that are being discussed outside of GRAMPS and the BetterGEDCOM project.
We certainly need to keep the Family Record / Object. Relationships within the record are what keeps the family together.
I do suppose that any deviation from Mother / Father / Child terms might throw those deviations out of the scope of Genealogy. The reality is that Mother / Father / Child is changing, or at least the change is surfacing, where Mother / Mother / Child or Father / Father / Child is becoming a reality. Do we not consider this as part of our discussion?
A BetterGEDCOM file might be impacted if the sending or receiving application either accepts or rejects data based on a same sex 'family'.
Russ
There are decisions to be made with this choice: what are family events vs individual events? Where is the marriage type (civil union, marriage) stored? How are relationships represented (e.g., married, divorced, remarried)? Gramps has solutions for each of those, and plans for others (for example, http://www.gramps-project.org/wiki/index.php?title=GEPS_001:_Relationship_type_event_link).
-Doug
Sorry, didn't mean to imply intent on your part. I was just being my usual obnoxious pontificatory self.
Tom Wetmore