About Citations

Moderator's Note: I had other plans for this page, but believe for now it might best be used as a place to catalog the various references to citations that have been discussed on the wiki. Will do my best to add to this information.

Note: I had included the date and time of each of the items that came from discussions, but that data was forced to become an anchor. I assumed that defeats my purpose, but it looks like some of those links take you right to the post ON the discussion.

Indirect evidence
Negative evidence
Information snippets
Multiple sources in a single citation

See also wiki page Citation Graphics and related page, Software Citations
---Citation Graphics presents images and related citations for about 20 sources (published books, published town vitals, census, research reports, etc.)
---Software Citations takes the same source (an 1880 U.S. census) through two or more software applications; includes GEDCOM output.
See also wiki page Citation Mechanics (better known as screenshots for Adrian)
---Citation Mechanics visually overviews the TMG process of building citations using "source elements" ["citation elements"]. Diagrams the difference between approaches taken by "lumpers" and "splitters."
See also wiki page Citation Specific Fields
---Citation Specific Fields outlines the makeup of the fields used for citation-specific ("assertion-level") input.
See also wiki page Repositories-repositories
---Repositories-repositories presents screenshots of application data about repositories.
See also wiki page Master Source
---Master Source provides screenshots of the "master source" input from various software applications.
Excel spreadsheet: Zotero field names (elements) Zotero Fields_alpha_97-04v.xls

Indirect evidence


As previously quoted, from _Evidence Explained_, indirect evidence is "relevant information that does not answer the research question all by itself. Rather, it has to be combined with other information to arrive at an answer to the research question."
Below are somewhat straightforward examples of items I consider indirect evidence; both are from my family file [emphasis added]:

1) Tombstone data or death record, as part of the evidence for a question pertaining to a person's date of birth (for the purpose of identification).
"Jones County Death Register #1" (1 Sep 1880 - 11 Nov 1897), transcription by …. ; database, Jones County GenWeb ( http://www.rootsweb. …) as p. 52, entry 621 for Mrs. Thomas, died at … 4 April 1888, age "85y (?)" 13 d, citing FHL film #… ; reports she was born in Michigan; buried … at …; from age at death (as 85 years, 13 days), compiler calculates estimated date of birth c22 March 1803 using TMG v6 date calculator.
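The date arithmetic in that note can be sketched as follows. This is only a minimal sketch: I do not know the exact convention the TMG date calculator uses, so the rule here (step back whole years and months first, then the days) and the function name estimate_birth are my assumptions.

```python
from datetime import date, timedelta


def estimate_birth(death: date, years: int, months: int = 0, days: int = 0) -> date:
    """Subtract an age at death (years, months, days) from a death date."""
    # Step back the whole years and months first (assumed convention).
    total_months = death.year * 12 + (death.month - 1) - (years * 12 + months)
    year, month = divmod(total_months, 12)
    month += 1
    # Clamp the day when the target month is shorter (e.g. 29 Feb in a non-leap year).
    day = death.day
    while True:
        try:
            anchor = date(year, month, day)
            break
        except ValueError:
            day -= 1
    # Then step back the remaining days.
    return anchor - timedelta(days=days)


# Mrs. Thomas: died 4 April 1888, aged 85 years, 13 days.
born = estimate_birth(date(1888, 4, 4), years=85, days=13)
print(born.isoformat())
```

Under that convention the result agrees with the c22 March 1803 estimate reported in the note. A calculator that subtracts the days before the years could differ by a day or so around month ends, which is one reason to record the "c" with the date.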

2) Census data, 1850 to 1870 for example, as part of the evidence for a question pertaining to family relationships.
1850 U.S. Census, Ray County, Missouri, population schedule, no city listed (District 75), sheet 618 (penned), page 310 (printed), dwelling 294, family 284, George F. Carle household, as of 27 Aug 1850; digital image, Ancestry.com (http://www.ancestry.com : accessed 18 Mar 2007), cites National Archives micropublication M432, roll 412. George is ae 39, born Ohio; his apparent wife is Elizabeth Carle, ae 32, also born Ohio. Apparent children are Richard, ae 13, Harriet, ae 10, John, ae 9, Lydia Ann, ae 2; and Elizabeth, ae 2/12, all children born Missouri except the eldest, Richard, said born Ohio.

Negative Evidence

Some examples of negative evidence that take the form of reference notes follow. These extracts were located at the Board for Certification of Genealogists Work Samples page (http://www.bcgcertification.org/skillbuilders/worksamples.html).
Some formatting was lost in translation :), emphasis added.

From Kay Haviland Freilich, CG, "Was She Really Alice Fling? Righting a Wrong Identity"; published _Quarterly_ 88 (Sept 2000): 225-28: (quoting; some formatting lost in transfer)
X. Chester County Orphans Court—Minors, Seeds, 1758, Chester County Archives. Emphasis added. This file also contains an invoice dated 30 October 1758 for “doctoring Richard Seed.” He obviously died before 1785, given that he is not named in the 1785 or 1797 documents.
X. Doc. 2186, ibid., emphasis added. Chester County’s recorded wills do not include one for Abigail Seeds; and the present writer has found no record of a marriage for her.

From "Who was the mother of James^2 Paule (1657-1724) of Taunton, Massachusetts?"; published TAG 73 (Oct 1998): 312-15 (quoting; some formatting lost in transfer):
X. Shurtleff and Pulsifer, Plymouth Colony Records, 3:122. See also Wakefield, "Richmond Family [....] The date and place of their marriage is unknown; it does not appear published in the vital records of Taunton or Newport.
X. Neither of Hannah Paule's parents is named in the colony's transcript of Hannah's birth record (Shurtleff and Pulsifer, Plymouth Colony Records, 8:69), but that transcript [...]"
X. ... There is no record of further action in this case; perhaps Hannah's removal to Plymouth ...

From Roger D. Joslyn, "Rebecca, wife of Thomas^1 Josselyn of Hingham and Lancaster, Massachusetts"; published _Register_ 158 (2004):330-40 (quoting; some formatting lost in transfer):
X. Middlesex County Probate, First Series, 3:238–39; see also Rodgers, Middlesex County Records of Probate [note 4], 626–27. There are no probate papers for this estate.
X. Ardleigh parish registers, FHL 1,565,698. These two baptisms were discovered by Leslie Mahler of San Jose, California, and sent to Robert C. Anderson, who shared them with the author. Peter C. Nutt also examined a transcript of Ardleigh registers at the Essex Record Office (ERO T/R168/1) but found no other Joslin baptisms and no Joslin burials.
X. Actually, there are no Jude/Judd wills for persons from Essex parishes surrounding Radwinter in the time period 1400–1720 (F. G. Emmison, Wills at Chelmsford (Essex and East Hertfordshire) [1400–1858], 3 vols. [London: The Index Library (The British Record Society, Limited), vols. 78, 79, 84, 1957–69], 1:239, 2:204–05).

Information snippets

In my working citations, I include information snippets. In a more final form (a biography, for example), the citations might appear a little differently. Pretty standard city directory citation in my file:
X. "Massachusetts City Directories," Salem and Beverly City Directory, 1886, p. 99 (Salem), entry for Elisha M. Bevins, fish dealer, 6 Washington sq, house at Beverly; digital images, _Ancestry.com_ (http://www.ancestry.com : accessed 16 October 2006).

Hypothetical variations for me:
X. "Massachusetts City Directories," Salem and Beverly City Directory, 1886, p. 99 (Salem), entry for Elisha M. Bevins, fish dealer, 6 Washington sq, house at Beverly; digital images, _Ancestry.com_ (http://www.ancestry.com : accessed 16 October 2006); his son Elisha M. Bevins, Jr. also listed, also a fish dealer with same locations; ad on page 103 for Bevins & Bevins at 6 Washington, no further information.

X. "Massachusetts City Directories," Salem and Beverly City Directory, 1886, p. 99 (Salem), entry for Elisha M. Bevins, fish dealer, 6 Washington sq, house at Beverly; digital images, _Ancestry.com_ (http://www.ancestry.com : accessed 16 October 2006). Separately noted, "Salem Vital Records, 1849-1910" report the death of Elisha M. Bevins 15 November 1885; 1900 U.S. Census reports his widow was still residing at 6 Washington ...

Note: I consider how the source will appear in the source list. For example, if I had volumes of entries from the Ancestry.com collection, "Massachusetts City Directories," I might think it's best to have the sources listed at that collection level. (In the case of census, for example, my entries are now organized/listed at the county jurisdiction.) Maybe I only have a couple of directories, from different areas, so I just list them separately.

Multiple Sources in a single citation
From Kay Haviland Freilich, CG, "Was She Really Alice Fling? Righting a Wrong Identity"; published _Quarterly_ 88 (Sept 2000): 225-28: (quoting; some formatting may be lost in transfer) - See BCG Work Samples.
3. National Genealogy Hall of Fame (Arlington, Virginia: National Genealogical Society pamphlet, annually revised). See also Gilbert Cope obituary, unnamed newspaper dated 18 December 1928, Newspaper Clippings File, Chester County Historical Society
4. Futhey and Cope, History of Chester County, 550 (Fling), 591 (Seeds). Most records on the family in the 1700s spell the name Seed, but across time the name has been more commonly spelled with a final “s”. The plural form will be used consistently in this paper, unless quoting directly.
Alice probably married the Hugh McNamee who was paid a day’s wages in Chester County in 1756 (see Buffington-Marshall Papers, Chester County Historical Society), before moving to Hagerstown, Maryland. There, on 9 August 1800, Alice McNamee and Job McNamee signed an administrator’s bond for the estate of Hugh McNamee, deceased; see Washington County, Administrators Bonds, A: 22, available as Family History Library microfilm 0,014,662.
9. Chester County Orphans Court—Minors, Seeds, 1758, Chester County Archives. ... This file also contains an invoice dated 30 October 1758 for “doctoring Richard Seed.” He obviously died before 1785, given that he is not named in the 1785 or 1797 documents.
10. Edward Seeds Will, proved 7 October 1754, Probate File 1549, Chester County Archives. The other unidentified child may be Francis Seeds who is cited as “absconded” on the Chester County Tax Discount List, 1765, Chester County Archives.
11. The papers were part of the Gilbert Cope Collection, originally housed at the Historical Society of Pennsylvania, Philadelphia. According to employees of the local society, county residents often turned over to Cope papers they found in attics, basements, etc., because of his well-known interest in the area’s history.

From Thomas W. Jones, Ph.D., CG, CGL, Merging Identities Properly: Jonathan Tucker Demonstrates the Technique, National Genealogical Society Quarterly 88 (June 2000): 111-21. See BCG Work Samples.
6. Letter, Ebn L. Tucker, Hartland, New York, to J. E. Heath, Pension and Bounty Commissioner, 25 April 1851; in Jonathan Tucker file, no. S42525, Revolutionary War Pension and Bounty-Land Warrant Application Files; microcopy M804 (Washington: National Archives and Records Administration [NARA]), roll 2420. Ebenezer’s death record—the only other document that identifies his parents—cites them only as Jonathan and Abigail Tucker; see Van Buren County, Michigan, Record of Deaths B:189, County Clerk’s Office, Paw Paw.
7. 1800 U.S. census, Cayuga County, New York, town of Aurelius, p. 706; NARA microcopy T32, roll 28. 1810 U.S. census, Cayuga County, town of Aurelius, village of Auburn, p. 38; NARA T252, roll 31. 1820 U.S. census, Cayuga County, p. 28; NARA T33, roll 68.
13. “Deaths, 1816–1824, from Auburn Gazette and Cayuga Republican, both Published Wednesdays in Auburn, New York,” Tree Talks 7 (September 1967): 132, provides a transcription of Jonathan’s death notice from the Cayuga Republican, 17 July 1822. The same death date is reported by Jonathan’s son, referring to a family Bible, in his 1851 letter to the pension office.
14. The Cayuga Republican, on 17 July 1822, reported that Jonathan died at the age of 60 years. In his own affidavit of 6 July 1820, Jonathan testified that he was “aged 58 years.”
15. Jonathan Tucker, Private, New Hampshire Second Regiment, roll 519 (18 cards); New Hampshire Second (Tash’s) Regiment, roll 523 (one card); New Hampshire Third Regiment, roll 531 (fourteen cards); New Hampshire, Kelley’s Regiment, roll 545 (two cards); all in Compiled Service Records of Soldiers Who Served in the American Army during the Revolutionary War, NARA microcopy M881. The thirty-fifth card extracts a muster and payroll dated 23 October 1776 for officers and soldiers of New Hampshire’s Second Regiment who were joining the Continental Army in New York. The date is too early to be Jonathan of Cayuga, and the service that Jonathan reported in his pension application does not match.
1. In scholarly genealogical literature, authors regularly correct same-name—same-person errors made elsewhere by other authors. See, for example, Margaret R. Amundson, “Rebutting Direct Evidence with Indirect Evidence: The Identity of Sarah (Taliaferro) Lewis of Virginia,” NGS Quarterly 87 (September 1999): 217–40; and David Kendall Martin, “Two Samuel and Hannah Hutchinses of Massachusetts and Maine,” The American Genealogist [TAG] 73 (July 1998): 172–75.
2. Helen F. M. Leary, “Is This the Same Man or Another One with the Same Name?” (lecture, NGS annual conference, Denver, May 1998), audiocassette recording available as DEN98-F144 (Hobart, Indiana: Repeat Performance, 1998), with printed matter of the same title published in Rocky Mountain Rendezvous: National Genealogical Society, 1998 Conference in the States, Program Syllabus (Arlington, Virginia: NGS, 1998): 354–57; and Elizabeth Shown Mills, “The Identity Crisis: Right Name, Wrong Man? Wrong Name, Right Man?” (lecture, NGS annual conference, Valley Forge, audiocassette recording available as VFP-F135 (Hobart, Indiana: Repeat Performance, 1997), with printed matter of the same title published in Pennsylvania, Cradle of a Nation: National Genealogical Society, 1997 Conference in the States, Program Syllabus (Arlington: NGS, 1997), 315–17.
5. See, for example, two articles by the present author: “Howerton to Overton: Documenting a Name Change,” NGS Quarterly 78 (September 1990): 169–81, which shows that the identities of seemingly separate men, John Howerton and John Overton, are one and the same; and “A Name Switch and a Double Dose of Joneses: Weighing Evidence to Identify Charles R. Jones,” NGS Quarterly 84 (March 1996): 5–16, which provides evidence to merge the identities of Charles R. Jones of Jackson County, Florida, and Robert Jones of Caroline County, Virginia, amid details that seemingly conflict. For examples of proof cases in which ancestors used multiple surnames with no visible or audible similarity, see Diane Renner Walsh, “One Family, Two Surnames: The Hunt Alias Malloy Family of Illinois and Missouri,” NGS Quarterly 86 (June 1998): 94–115; and Elizabeth Shown Mills, “The Search for Margaret Ball: Building Steps over a Brick-Wall Research Problem,” NGS Quarterly 77 (March 1989): 43–65.


GeneJ 2011-06-01T18:34:34-07:00
Multiple sources in a single reference note.
See the examples.

Multiple sources in a single reference note is discussed in another area of the wiki.
See http://bettergedcom.wikispaces.com/message/view/Defining+E%26C+for+BetterGEDCOM/39811352

(1) If you view the entry for "source of the source" as a second source, then it's possible quite a large group of my reference notes refer to more than one source.

(2) During the research process, if it adds value directly to the understanding/interpretation of the originally cited source, I add additional sources to the reference note. See the example of the citation for the marriage of William and Aseanath.

(3) Similarly, two sources can be almost co-dependent, so I will put both sources in the same reference note. Consider the reference note that reports results from TMG date calculator based on age at death inscribed on a tombstone.

(4) I comment about negative evidence in the same place the corresponding evidence appears, so my negative evidence is often added to a reference note as a second source.

(5) In the research phase/working file, I often have several different reference notes pointing to a single pfact. My software allows me to prioritize the listing of those multiple reference notes--I can keep the reference note about my best evidence on top. (See no. 6)

(6) (See no. 5) In a final presentation, say I was publishing a book, I'd want just one note reference (the number element) in a single spot. If my working file had two final notes referring to the same bit of data, I'd merge/concatenate those remaining two notes into one note. See Evidence Explained (2007), p. 51 and Chicago Manual of Style online, sections 14.23 and 14.62.
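Points (5) and (6) can be sketched together. This is a rough illustration only: the priority numbers, the function name merge_notes and the placeholder note texts are hypothetical, and joining the notes with semicolons is my assumption about a typical house style, not a rule taken from any particular application.

```python
def merge_notes(notes):
    """Merge several working reference notes into one final note.

    notes is a list of (priority, text) pairs; a lower priority number
    means better evidence, so that note is listed first.
    """
    ordered = sorted(notes, key=lambda note: note[0])
    # Drop each note's closing period, join with semicolons, close once.
    return "; ".join(text.rstrip(".") for _, text in ordered) + "."


working_notes = [
    (2, "Second-best source, page 10."),   # placeholder text
    (1, "Best source, entry 621."),        # placeholder text
]
print(merge_notes(working_notes))
```

The working file keeps the notes separate and ranked; only the final presentation sees the single concatenated note behind one note number.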

Hope this helps. --GJ
GeneJ 2011-07-14T20:22:36-07:00
FamilySearch presentation about some source system terminology for your reading pleasure
Robert Raymond (FamilySearch), "Interoperable Citation Exchange 2009-03-11.pdf"

louiskessler 2011-07-16T11:47:14-07:00

That presentation is excellent. That is the sort of work we here at BetterGEDCOM should be doing.

AdrianB38 2011-07-16T12:01:01-07:00
"ESM is a *very* firm believer that genealogy software can not and should not be used to write narrative reports and has said so on several occasions"

How interesting. So do I and so have I!

Having said that, not everyone has delusions of adequacy as an author and even I am prone to running off a quick narrative report to pass to someone. Or even to use as the basis for a human written report.

Hm. Anyone notice I said "run off a quick narrative report" and not "Export a GEDCOM"?
AdrianB38 2011-07-16T12:03:11-07:00
"That is the sort of work we here at BetterGEDCOM should be doing"

Then let us change the goals of BetterGEDCOM.
GeneJ 2011-07-16T14:54:40-07:00
@Adrian/Andy .. narratives

Stylized narratives here in the states are part of our genealogical standards. I know the larger vendors here have had those standards explained rather directly. Some just choose not to invest in the necessary coding. Others, like GenBox's ca2002 effort, are about dead on.

I believe Mills would agree (for Quarterly style) with the quote from Hoff and Leclerc (2004) about Register Style. Hoff and Ullmann write (ch. 2, p. 21), "Authors should be aware that genealogical computer programs claiming to generate a "_Register_-style report" are usually quite deficient. Such a report will need substantial editing by the author prior to submission in Word."*

As I see it, it is BetterGEDCOM's job to improve upon the data standards in support of the whole process.

*P.S. Andy ... TMG shorthand ... If TMG had a birth-death-marriage option and enabled "recognize the full spousal identification," it would help greatly.
GeneJ 2011-07-16T15:27:07-07:00
@Adrian ...

Separate from getting the case done (Sheriff William's identity crisis), I've also been working on graphics about the same topics you mention. (Master Source, Source List, etc.)

(1) GEDCOM tried to use the "source list entry" as the master source, just as Raymond did. We can and should do a better job. I'm working on some census graphics to explain what I mean, but my bibliographic entries are entered at a higher level.

(2) The term "citation detail." I caught that slip, too. I'm not sure the "instance" is best developed as a single element.

I'll go work some more on those examples ...
AdrianB38 2011-07-17T03:32:16-07:00
Gene: "GEDCOM tried to use the "source list entry" as the master source ... We can and should do a better job"
And here was me thinking you were just after making a small change to GEDCOM! <grin>

More seriously - censuses might be a good example because they must be pretty similar across the globe, hence it reduces the scope for misunderstanding. The only _major_ differences I see between those in the UK (1851+) and in the USA (1850+) are that your earlier ones exclude relationships and their censusing was done over a period of weeks. In terms of their arrangements, they are similar - OK you seem to use EDs whereas in England (but not Scotland) we ignore the EDs in favour of our National Archives document references, but otherwise...

Now, there's still plenty of scope for deviation genealogically in the UK. I create a source record for each household on the census (i.e. each household comes out as a level 0 SOUR record on my GEDCOM - sorry to talk techie like this but it's the only way I can be certain we're clear). If I had a bibliography in my reports (Family Historian doesn't have bibliographies), it would therefore presumably come out with an entry for each household. This seems a waste of time to me and it would seem much more sensible to have one bibliography entry for the whole of the English 1851 census (say). (NB - I deliberately said "English" there as the Scots censuses are held elsewhere, and are organised and accessed differently).

Conversely, I know there are others who just have one level 0 SOUR record for the whole of our 1851 (whatever) census, which means all their comments at the household level (like - how do we know this is the right household?) have to be repeated several dozen times in their reference notes linking the several dozen "facts" to the source record for the census.
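The trade-off between the two approaches can be sketched with plain dictionaries. This is a simplified illustration, not GEDCOM syntax: the record layout, the field names and the sample comment are hypothetical, and the household is borrowed from the 1850 example higher up this page.

```python
def household_level(facts, census, household, comment):
    """One source record per household; the comment lives on the source."""
    source = {"id": "S1", "title": f"{census}, {household}", "comment": comment}
    links = [{"fact": fact, "source": "S1"} for fact in facts]
    return [source], links


def census_level(facts, census, household, comment):
    """One source record for the whole census; the comment rides on every link."""
    source = {"id": "S1", "title": census}
    links = [
        {"fact": fact, "source": "S1", "where": household, "comment": comment}
        for fact in facts
    ]
    return [source], links


def comment_copies(sources, links):
    """Count how many times the household-level comment is stored."""
    return sum("comment" in record for record in sources + links)


facts = ["George: residence", "Elizabeth: residence", "Richard: residence"]
args = (
    facts,
    "1850 U.S. Census, Ray County, Missouri",
    "George F. Carle household",
    "Why this is the right household (sample comment)",
)
print(comment_copies(*household_level(*args)))
print(comment_copies(*census_level(*args)))
```

With the household as the source record the comment is stored once; with the whole census as the source record it is stored once per linked fact. Neither layout says anything yet about what the bibliography should list, which is the separate question raised above.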

Maybe talking round these sorts of issues will help.
GeneJ 2011-07-17T06:21:04-07:00
You wrote, "Conversely, I know there are others who just have one level 0 SOUR record for the whole of our 1851 (whatever) census, which means all their comments at the household level (like - how do we know this is the right household?) have to be repeated several dozen times in their reference notes linking the several dozen "facts" to the source record for the census."

Believe we are saying the same thing.

My source list entries are more general than are my "master sources."

Although I'm usually dealing with census stateside, I also set a "master source" at the household level.

For US census, my source list (bibliography) reports at the year/county level, but I suppose I could set it even higher.
AdrianB38 2011-07-17T07:05:15-07:00
I suspect we are - but I've said that before. I think examples would help because terms like "source list" "source list entries" and "master sources" are meaningless if, like me, you use an application that has just one thing called a source (the thing that comes out as a level 0 SOUR in the GEDCOM) and no (unfortunately) bibliography. They're not meaningless terms to you - it's me who has the problem.
GeneJ 2011-07-21T06:06:44-07:00
@Slide 15, Raymond uses the example of a published book and then writes that a "source list entry" "includes repository" and, conversely, that the first reference note "excludes repository."

Believe he confused the principles about citing repositories.

(1) We usually don't cite the repository of a published book (regardless of whether the citation is a source list entry, a label or a reference note).
Mills, EE (2007), p 51, "When citing manuscripts that exist in only one place, the identity of that repository is an essential part of our citation. When citing books, film, and other published materials that are widely available, the name of the repository at which we used the source is not included in our formal citation. (Libraries holding copies of specific published works are identifiable through the Online Computer Library Center’s WorldCat database, and similar resources.) ..." Mills goes on to explain that, for a published book, citing the library/repository convenient to one user would be of little value to someone who lived elsewhere.

(2) When the repository is included/cited (because it's relevant to locating items not published), that citation element is usually included in all the primary forms of citation--the source list entry, source/document label and _first_ reference note. (Most users wouldn't include reference to the repository in a _second/subsequent_ reference note, i.e., a short form reference note.)
GeneJ 2011-07-21T06:45:23-07:00
More about the posting just above.

See Raymond's slide 24 (of 83 in my version), for his exchange with Mills, who again clarifies, "In printed citations of published works, don't display the repository or call number unless copies are difficult to find."
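The repository principles in this post and the one just above can be condensed into a small rule. This is only my reading of them, sketched under assumptions: the form names and the function show_repository are hypothetical labels, not terms from EE or from any application.

```python
# Hypothetical labels for the primary (long) forms of citation.
LONG_FORMS = {"source_list", "label", "first_note"}


def show_repository(published: bool, form: str, hard_to_find: bool = False) -> bool:
    """Decide whether the repository element is displayed in a citation form."""
    if form not in LONG_FORMS:
        # Short-form (second/subsequent) reference notes omit the repository.
        return False
    if published and not hard_to_find:
        # Widely available published works: no repository in any form.
        return False
    # Unpublished material, or published copies that are difficult to find.
    return True


print(show_repository(published=True, form="source_list"))
print(show_repository(published=False, form="first_note"))
print(show_repository(published=False, form="subsequent_note"))
```

Read this way, the repository is never a "source list only" element: it is either relevant to finding the item, and then appears in every long form, or it is not displayed at all.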
AdrianB38 2011-07-21T09:54:19-07:00
"In printed citations of published works, don't display the repository or call number unless copies are difficult to find."

On the other hand, it's also said to be useful to store them to remind yourself where you found it.

Clearly - because I'm not arguing with your principle - it would be useful to have a means of storing but not printing bits like repositories. At first sight this would be accomplished through an output template for that source type. However, some books would show repository and some not. You could have 2 book templates "Book" and "Old Book" - but this is getting ridiculous. I think it would be better to allow a means to suppress each item from printing in the citation on a source by source level - i.e. some sort of marker against each item on each source.

That is an application feature, it seems to me. Question: Should suppression markers of this sort be output on a BG file? i.e. if one person stores but does not print for this one source - would they want that setting to go onto a BG file?
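A per-item marker of that sort might look like the sketch below. Everything in it is hypothetical: the Item class, the printed flag and the placeholder book are illustrations, and whether a BG file should carry the flag is exactly the open question above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Item:
    label: str
    value: str
    printed: bool = True  # the per-source, per-item suppression marker


def print_citation(items):
    """Render only the items that are not suppressed."""
    return ", ".join(item.value for item in items if item.printed) + "."


def export_items(items):
    """Export every stored item, marker included, suppressed or not."""
    return [asdict(item) for item in items]


# Placeholder book: the repository is stored but suppressed in print.
book = [
    Item("author", "Example Author"),
    Item("title", "Example Title"),
    Item("repository", "Example Library", printed=False),
]
print(print_citation(book))
print(export_items(book)[2])
```

This avoids a second "Old Book" template: the template stays the same and the marker travels with the individual source. If the export carries the marker, the receiving application can honour it or ignore it.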
GeneJ 2011-07-21T12:09:44-07:00
I'd personally rather see the elements built up, so you either include or don't include that element.

But if vendors use suppression, we would want the ability to pass it on.
GeneJ 2011-07-14T21:00:28-07:00
P.S. Slide no. 21 refers to "Smallest-to-largest."

See also, EE (2007), p 118-19 for International Differences, "Traditionally, citations in the United States have used a dual system that calls for one arrangement in reference notes and another in the Source List." ...

Mills goes on to recognize that in the US, reference note citations "start with the smallest element in the citation and work up to the largest (the archive and its location)." In contrast, she also writes that internationally, reference notes typically "start with the largest and work down to the smallest."

See the referenced text for her further comments.
AdrianB38 2011-07-16T08:18:28-07:00
I'm not sure how many realise this but the question of smallest to largest or vice versa will NOT be an issue for a well-designed data model for storing "citation" data.

Quite simply, it should store the individual items from the "citations" (i.e. from the bibliography, first and subsequent reference notes) at what I might term the "atomic level". By that term, I mean - the level at which the data has been broken up (or broken down??) as much as is possible and no further than is necessary. Thus, "author's name" is not at the atomic level - it needs to be broken down further into author's-family-name and author's-given-name because the Chicago Style uses
- "author's-family-name, author's-given-name" in the bibliography;
- "author's-full-name" in the first reference note
- "author's-family-name" in the subsequent reference notes.

NB "author's-full-name" means the author's full name in whatever their own reading order is. For the vast majority of British names, it would be given name then family name. For others, it could be family name then given name.

So, for the author's name, we need to store author's-family-name and author's-given-name plus (presumably) a marker informing us which order the names are in. Similarly, we store each of the elements (largest through smallest) but marked up by what they are.
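As a minimal sketch of that idea, the class below stores the name at the "atomic level" and derives the three forms listed above. The class and field names are hypothetical, and the family_first flag is just one way to hold the reading-order marker.

```python
from dataclasses import dataclass


@dataclass
class AuthorName:
    family_name: str
    given_name: str
    family_first: bool = False  # marker: the author's own reading order

    def full_name(self) -> str:
        """Full name in the author's own reading order."""
        if self.family_first:
            return f"{self.family_name} {self.given_name}"
        return f"{self.given_name} {self.family_name}"

    def bibliography(self) -> str:
        return f"{self.family_name}, {self.given_name}"

    def first_note(self) -> str:
        return self.full_name()

    def subsequent_note(self) -> str:
        return self.family_name


mills = AuthorName(family_name="Mills", given_name="Elizabeth Shown")
print(mills.bibliography())
print(mills.first_note())
print(mills.subsequent_note())
```

None of the three output forms is stored; each is rebuilt from the two stored parts plus the marker, so a new style only needs a new method, not new data.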

Thus, Gene could store her data and produce reports in US style smallest up to largest. If she sent me a BG file of her work, then I could, by application of the appropriate instructions, print them for UK consumption in the order largest through to smallest. Having, of course, prefixed them by some text like "Gene J references..."
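The same stored elements can be printed in either order, as in this sketch. The level numbers and the Element type are hypothetical; the sample values are taken from the Edward Seeds will note quoted earlier on this page.

```python
from typing import NamedTuple


class Element(NamedTuple):
    level: int   # hypothetical rank: 1 = smallest unit, higher = larger container
    label: str
    value: str


def render(elements, largest_first=False):
    """Print the stored elements smallest-to-largest or largest-to-smallest."""
    ordered = sorted(elements, key=lambda e: e.level, reverse=largest_first)
    return ", ".join(e.value for e in ordered) + "."


will = [
    Element(3, "repository", "Chester County Archives"),
    Element(1, "item", "Edward Seeds Will, proved 7 October 1754"),
    Element(2, "file", "Probate File 1549"),
]
print(render(will))                      # US style: smallest up to largest
print(render(will, largest_first=True))  # largest down to smallest
```

The stored order does not matter; the receiving application picks the direction at print time.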

There are several bits I've skipped over here, of course, such as whether one can look into the future far enough to know that you've broken the elements down far enough to satisfy not just today's formats but tomorrow's as well.

I also ignore whether or not the BG model should cover not just the citation elements themselves but also the templates for how that data could be printed.
GeneJ 2011-07-16T08:38:01-07:00

AdrianB38 2011-07-16T09:16:16-07:00
I've just spent some time reading that linked document on FamilySearch's Interoperable Citation Exchange (ICE). I note it seems to have been March 2009 when it was produced, and it said (in hope) that the "planets have aligned". Anything happen, do we know?

The document makes an honest attempt at reviewing various things and is useful for comparing terms in software. But it shows just how difficult words are by deprecating "citation detail" on slide 32 and then using it on slide 64!

And frankly, I'm still confused by this term "Master Source". Slide 31 says "Master Source - This term is used in many database management programs to refer to individual items in your Source List." Well, without a definition of Source List, this isn't clear, but I _think_ it means "This term is used in many ... programs to refer to individual source-records in your file." What comes out as level 0 SOUR entries on a GEDCOM. But then the same slide says "Typically, a Master Source is a simplified citation". Sorry - is it a record in a file or a citation describing where to find the original source for the record in a file?

Can I assume that a Master Source is an individual record of a source in my file? Where record can also mean a row in a table of sources, to put it in database terms?

Incidentally, I get a distinct feeling that ESM has written EE etc from the viewpoint of someone writing reports by hand (or keyboard) - or at least, finishing them off thus - e.g. slide 66 deals with the issue of someone wanting to condense their bibliography. The advice refers to EE and seems to skate over software limitations. I'm saying this without a copy of EE, only the earlier Evidence!, I should add.
Andy_Hatchett 2011-07-16T09:25:49-07:00
ESM is a *very* firm believer that genealogy software can not and should not be used to write narrative reports and has said so on several occasions. She and Bob Velke go round and round on this every so often...fun to listen to them!
GeneJ 2011-07-21T15:25:32-07:00
accessibility principles
There are a fair number of citation principles that relate directly to accessibility (my term).

Every single source I have falls now into one of three groups:

Privately held
Held in a formal repository

Now, information we access via the internet falls into the latter group.

There are a fair number of citation elements that relate directly to accessibility. (Again, my term.)

I'm wondering, hasn't Internet access gained enough repute to be considered its own access category?

AdrianB38 2011-07-22T07:14:00-07:00
It's a good question. Very often people quote a web-site in the data item "Repository". I tend to go with the old-fashioned view on this and say it isn't a Repository. The rest of this post may get a bit theoretical.

The usual definition of Repository is (in summary) "Where the data actually is". On that basis, people have said, digital data is actually on servers and not on the web-site itself, so putting "www.ancestry.com" as the Repository is not right and since no-one knows (or cares) about the server names, we cannot complete the Repository name for digital data.

Well, OK. Maybe. Not sure you can be quite that simplistic about names, but never mind.

On the other hand.... I suggest that the definition of Repository above is wrong. I suggest most people use it not to say "Where the data actually is", but "How to get to the data" (i.e. accessibility). Take the example of documents stored off-site. If I go to Chester Archives, having ordered some off-site copies, what do I put into the Repository for them? I won't put "Winsford Salt Mine" because, even if I knew they came from there, it's fairly pointless. "Chester" is all that's needed - probably with a note somewhere to say "Held off-site at time of consultation".

Under the "How to get to the data" definition for Repository, www.ancestry.com does very nicely as a Repository.

However.... (again)

We also need to store the equivalent of "what edition is it?" for digital data (i.e. what's the publication data?). With a book it's easy - look in the front pages and it's the 1927 edition published by Cambridge University Press. There is no equivalent stamp for much digital data - all we can do is quote the access date and the URL so we can see whose version of the California Birth Index I used. So that access date and URL go into the publication data for a web-site.

We therefore have the possibility of a web-site's address (URL) being in both the publication data and the Repository. On the principle of not repeating values, and not wishing to split address from access date leaving half an item in the publication data, I would tend to agree that one should put the address in the publication data, and leave the Repository empty for internet sourced data.

But I am unhappy about this - it means that while I can run off a list of everything I got from Chester, I can't run off a list of everything I got from Ancestry in the same manner.

What's clearly needed is the ability to do what my application can't, and that's the ability to store Repository (and other items) but - for some types of stuff - not to output them in citations.

Notice again, we have (or I'd like to have) a difference between what is stored in the data and what is to be printed.
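
The split between what is stored and what is printed could be sketched roughly as below. This is only an illustration: the `Source` class, its field names, the `print_repository` flag, and the sample records (including the access date) are all hypothetical and do not come from any existing application or from GEDCOM.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    repository: str                # "how to get to the data"
    publication: str               # e.g. URL plus access date for a web-site
    print_repository: bool = True  # hypothetical flag: stored, but maybe not printed


def format_citation(src: Source) -> str:
    """Build a citation string, leaving the repository out when it is suppressed."""
    parts = [src.title, src.publication]
    if src.print_repository:
        parts.append(src.repository)
    return "; ".join(parts)


def sources_from(sources: list[Source], repository: str) -> list[str]:
    """List the titles of everything obtained via one repository."""
    return [s.title for s in sources if s.repository == repository]


# Made-up sample data for illustration only.
sources = [
    Source("Parish register (off-site copy)", "Chester Archives", "unpublished"),
    Source("California Birth Index", "www.ancestry.com",
           "www.ancestry.com, accessed 22 July 2011", print_repository=False),
]
```

With this shape, the repository is always recorded, so a list of "everything I got from Ancestry" is possible, while the printed citation for the web-site does not repeat the address.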
GeneJ 2011-07-28T10:09:21-07:00
@Adrian ...

Are you saying you want to record say "Ancestry.com" as a "little r" repository? So your administrative repository list might read:

Ancestry.com (1), 360 West 4800 North, Provo, UT 84604 Phone: 800-262-XXXX (url: http://corporate.ancestry.com/contact-us/)
Ancestry.com (2) (URL: http://www.ancestry.com )
Byron G. Merrill Memorial Library, 10 Buffalo Road, Rumney NH 03266, 60-786-XXXX. Contact: Susan XXXX. (URL(1): http://rumneylibrary.blogspot.com/ , (2) http://www.rumneynh.org/rumney_web_12160_000014.htm )
Columbiana County Archives and Research Center, 129 South Market Street, Lisbon, Ohio 44432. Phone 330-429-XXXX (URL: http://www.columbianacountyarchives.org/ccarc2009_001.htm )
Jones County Historical Society, 13838 Edinburgh Road, Scotch Grove, IA 52310. Contacts: Byron XXX, 563-587-XXXX (URL: http://www.rootsweb.ancestry.com/~iajchs/jchs.html )
Center for Archival Collections, Bowling Green State University, Bowling Green, OH 43403-0001 (URL: http://www.bgsu.edu/colleges/library/cac/ )
New Hampshire State Library, 20 Park Street, Concord, NH 03301. Contact: Edward XXXX, 603-XXX-XXXX (URL: http://www.nh.gov/nhsl/ )
The Church of Jesus Christ of Latter-day Saints, Mesa Arizona Large Multi-Stake Family History Center, 41 S Hobson, Mesa, Arizona, 85204
Family History Library, 35 North West Temple Street, Salt Lake City, Utah, 84150, 866-406-XXXX (URL: http://www.familysearch.org )
FamilySearch (URL: http://www.familysearch.org )

In the alternative, are you interested in being able to segregate and identify all the sources associated with online access? --GJ
GeneJ 2011-07-28T10:14:13-07:00
p.s. ...so much for my attempt to alphabetize.
AdrianB38 2011-07-28T12:56:03-07:00
1. I don't see why I would want to identify all the sources or repositories associated with online access. I just want to identify them all, and record the access for them all, and preferably without breaking ESM's rules against pointless repetition.

2. Therefore, yes, I'd like to see the ability to record Ancestry.co.uk (that's where _I_ get on it!) as a repository because that's how I access the data.

But at the same time, I'd like to record Ancestry.co.uk as the publication mechanism.

3. As a consequence of putting it in both, I'd need the ability to suppress one in citation formats to stop pointless repetition of the words "Ancestry.co.uk".

4. Just as a thought - should we not be using "Ancestry.com Operations Inc.", or similar, to avoid the lack of clarity whether we mean the web-site or the company?
GeneJ 2011-07-28T15:03:02-07:00
I'm going to start a wiki page "About Citations" > Repositories and repositories
GeneJ 2011-07-21T18:02:19-07:00
Terms for Processed Records (formats)
EE (2007), p. 28-31.

Database / Index
Duplicate copies
Duplicate originals (counterparts)
Image copies
Record copies (aka Clerk's copies)
Transcriptions (unedited)
Transcriptions (edited/embellished)

At pg. 36-37, Mills writes, "The reliability of every derivative work is influenced by the degree of processing it has undergone ... Each additional layer of processing adds to the likelihood of transcription errors ... we diligently track the ancestry of derivative works, hoping to trace each to its original source or, at the least, to the earliest extant version."
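
Tracking the "ancestry" of a derivative work might be pictured as a simple chain, as in the sketch below. The `Record` class and the sample chain are hypothetical; the labels are borrowed loosely from the format terms listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    label: str
    derived_from: Optional["Record"] = None  # None marks an original


def trace(record: Optional[Record]) -> list[str]:
    """Walk back through each layer of processing to the earliest known version."""
    chain = []
    while record is not None:
        chain.append(record.label)
        record = record.derived_from
    return chain


# Made-up example: an index built from an image copy of an original.
original = Record("original register")
image = Record("image copy", original)
index = Record("database / index", image)
```

Here `trace(index)` walks from the index back to the original, and the length of the chain minus one is the number of processing layers in between.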
GeneJ 2011-07-21T18:18:46-07:00
Citation privacy issues
EE (2007), p. 45.

"Citations that involve living people must respect privacy."

(1) Minimally identifying (name, city, state)
(2) When publishing or circulating (print or online), one should have authorization so long as the informant survives
AdrianB38 2011-07-22T04:15:41-07:00
This is an example of the extra layer (what were we on - the 4th layer?) that Tom mentioned a while ago.

The raw data was the first, then we went up, via at least one intermediate level, before we reached the layer of the conclusions about people, families, places, etc. But then - on top of that layer - we have yet another layer (or view) about how those conclusions are to be published.

Specifically, in my database I must keep the full details in those data items that make up the citations.

It's only when the stuff gets printed, put on-line, or put into a GEDCOM for sending elsewhere - err, maybe! - that I should apply the next layer (or view, if you prefer) to deduce which items should be omitted.

The question is - how much of this extra layer / view should be in BG (or even a slightly modified GEDCOM) - is there any point in me putting an email address onto my GEDCOM marked up to say "Do not publish"?

PS - I don't have an answer to that last question.
AdrianB38 2011-07-22T04:34:20-07:00
Apologies for multiple posting - I thought it wasn't getting through. I think that crossing the Atlantic from UK to USA is going up-hill judging by the traffic problems I'm having in that direction at the moment.
GeneJ 2011-07-22T06:41:17-07:00
Okay ...

I'm sorry, how does this have anything to do with an extra layer??

Posting to web = publishing
User to user = sharing

As for the fourth universe, I believe I'll post in the discussion link below:

I know many users saw Mills' citation principle about "address for private use" as a major step forward when EE (2007) was published. --GJ
AdrianB38 2011-07-22T08:05:13-07:00
"how does this have anything to do with an extra layer??"

Because the published version is different from the shared version.

That's an extra layer for me, because GEDCOM sits at the level of the share.

Do we want to include information about how to publish in the shared version? I'm not sure we do, but either way we need to be clear about it.
GeneJ 2011-07-22T08:26:30-07:00
If you post a gedcom to the internet, you have published it, no?
AdrianB38 2011-07-22T11:51:12-07:00
Sorry - yes - I forgot you might use a GEDCOM to load an internet site.

But - the published version of a GEDCOM is almost certainly still different from the shared version, when there are privacy issues to consider.

If you and I were working on the same families, and I trusted you totally to be professional with the data, I would send you a GEDCOM to share with the full details on it.

But if I wanted to load the same families onto an internet site - maybe an Ancestry tree, I would rip out all the bits I wanted to remain private and / or change the names, dates, etc.

So the published version is different from the shared version. Sometimes the citations include the full details (the shared version), sometimes they don't (the published version). That published version therefore exists in a different universe, if you like, the one with less than 100% of the data there. That's what I want to get over - the difference exists and it's all to do with publishing.
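
The shared/published difference could be sketched as two views over the same stored items, as below. The `Item` class, the `private` marker, and the sample values are hypothetical illustrations, not a proposal for how BG or GEDCOM should mark such data.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    value: str
    private: bool = False  # hypothetical "Do not publish" marker


def shared_view(items):
    """Everything, for a trusted fellow researcher."""
    return {i.name: i.value for i in items}


def published_view(items):
    """Only the items not marked private, for a public web-site."""
    return {i.name: i.value for i in items if not i.private}


# Made-up citation details for illustration only.
citation = [
    Item("informant", "A. Cousin"),
    Item("email", "a.cousin@example.com", private=True),
]
```

The database keeps 100% of the data; the published view is simply derived from it at output time.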
GeneJ 2011-07-22T14:02:05-07:00
Exactly ... and if I were transferring my own file, I'd want the details to move, too.

If I wanted to share with Susi NewCousin, I might want to keep those bits private, but with Jimbo OldCousin, I might feel differently.
GeneJ 2011-08-01T12:11:11-07:00
The RootsMagic privacy settings and GEDCOM export options are now part of a graphic.

See the file graphic: TransferProblemVisualized_RM1880_US_Censusb.png

In context, as below:
GeneJ 2011-07-21T18:32:08-07:00
Record Types from the EE Table of contents
I wonder how these compare internationally

Archives & Artifacts
Business & Institutional Records
Cemetery Records
Church Records
Local & State Records: Courts & Governance
Local & State Records: Licenses, Registrations, Rolls & Vital Records
National Government Records
Publications: Books, CDs, Maps, Leaflets & Videos
Publications: Legal Works & Government Documents
Publications: Periodicals, Broadcasts & Web Miscellanea
AdrianB38 2011-07-22T06:37:24-07:00
OK - I'm cheating slightly because I'm also looking at that evidence_style.XLS spreadsheet to see what's in the categories you quote.

But here goes:
Over here (in the UK) a place where people are buried is (99% of the time) either a graveyard attached to a church / chapel or a cemetery administered by local government. There WERE private cemetery companies. Not sure if they still operate. Cemetery and graveyard do not have the same meaning over here.

So for me in the UK - taking the literal meaning of cemetery, as above, "Cemetery Records" are part of "Local & State Records". BUT this misses the fact that church records may contain records identical to some (not all) of the Cemetery Records. To state the obvious, a headstone on a grave should be described identically no matter whether it's in a local government cemetery or a church graveyard.

The term "state government" would generally be understood in the UK to mean that applying to the UK as a whole. We have one state. (Though the devolved governments of Northern Ireland, Scotland and Wales confuse the issue).

Even the dividing line between local and central government wanders over the years. While the Army has always been a central government thing, when the Militia was recreated in the late 1700s to deal with the threat of French invasion (not to mention the odd American warship in the War of 1812!), the Militia was a responsibility then of the Lord Lieutenant of the county - i.e. "local" government. Only well into the 1800s did the Militia become administered by the War Office in London.

Again, to quote other oddities, local rates have been administered by local (civil) government for decades but in the early 1800s were administered by the church since the parish was originally the basis for all local government.

I recognise the many types of document have to be split somehow - this split may work for the US but creates anomalies for the UK.
GeneJ 2011-07-22T08:36:25-07:00

So, we would have agreement UK to US as far as "types" about:

Archives & Artifacts
Business & Institutional Records
Publications: Books, CDs, Maps, Leaflets & Videos
Publications: Legal Works & Government Documents
Publications: Periodicals, Broadcasts & Web Miscellanea

And we should look more closely at:
Cemetery Records
Church Records
Local & State Records: Courts & Governance
Local & State Records: Licenses, Registrations, Rolls & Vital Records
National Government Records

We have the same military/militia jurisdictional variance. We need to look at that. One of my rev. war folks served in both the continental line, a then version of what became central gov (say 1 year) and a state militia (3 years) during that war. His service then continued in the state militia after the war.

GeneJ 2011-07-31T06:55:46-07:00
@Adrian ... re "taking the literal meaning of cemetery, as above, "Cemetery Records" are part of "Local & State Records". BUT this misses the fact that church records may contain records identical to some (not all) of the Cemetery Records."


I'm building some tombstone inscription citations and find in EE (2007), p. 220, " ... Americans typically use the term cemetery to refer to all types of burial grounds, Australians often refer to ... graveyards, and British researchers typically make a distinction between a cemetery and a churchyard." And then also, "U.S. researchers tend to speak interchangeably of inscriptions from cemeteries, grave markers, gravestones, and tombstones, while in Scotland and England, the term is typically monumental inscriptions."

Mills comments, "You should use the appropriate term for the culture in which you are doing research."
GeneJ 2011-08-01T09:50:15-07:00
Terminology again. See the graphic here:


Shall we refer to these high level labels as "record types" (or "source types" or "source groups")?
AdrianB38 2011-08-01T13:53:38-07:00
Sorry Gene - you're pressing all the wrong buttons here! One of the things I explained zillions of times in my previous life as a software guy was the difference between Locomotive Type and Locomotive Class. Well, of course, it's not remotely self evident and the use of Type and Class (rather than Class and Type!) was just an arbitrary naming convention dating back to Southern Pacific Railroad from whom we'd bought the software.

IF (and it's a big if) we can come up with a slightly longer name to explain what the type (or class or...) means, then we should.

In your diagram, >>if<< Source Type were to be used solely for Master Source Templates, then we should call it Master Source Template Id. However, I suspect that might be putting cart before horse and several apps may have a Source Type but not use it for selecting Templates.

Re your suggestions:
- just calling it "record types" is not a good idea since there are all sorts of record types, from individuals, through families to sources.
- "source group" I dislike because "group" could be just some arbitrary selection. It doesn't convey the idea of some underlying purposes.

Can you come up with a short explanation of what RM uses these things for? And would that apply elsewhere? If RM uses it for Master Source Templates, then in RM it would be best to call it Master Source Template Id.

But what other uses are there elsewhere?
GeneJ 2011-08-01T15:50:19-07:00
See my just now post to prioritization under Geir's, "... Architecture for Sources ..."

Whether we call it type or class, or source or item .. I hope we can come together on a high level terminology around which we can find agreement.

As a note of personal bias, I don't think the set of Mills "QuickCheck" models should form the basis of a standard. It should be just as easy for me to cite a vital record pulled from FamilySearch's digital image library as it should be to cite a digital image of the same thing that I inherited as part of a private collection or received via dropbox from cousin Joey, even though I might use different elements to recognize how I accessed the item or the "source of the source" references.

AdrianB38 2011-08-02T13:34:00-07:00
I don't know yet whether Geir tackles this issue, but it seems to me that we need to understand that the Source Group categories in the FTM screen shot are simply a mechanism to get to the appropriate template. And nothing to do with BG.

I can't actually work out the sequence of choices made in that FTM screen shot but if I wanted to reach the template for the (say) 1861 English census images from a commercial website, then, at the moment it _seems_ to me that I'd need to go to census first, then digital images, then ????

It DOESN'T matter which way I get there. I could click commercial website first, THEN, census, THEN... and I should end up in the same place.

So worrying whether the access path should be Census / Digital / England or Digital / Census / England is wrong. It should be both, so long as we end up with the same template. And the access path is nothing to do with the name of the template. It's just that the (human) name probably looks like one of the access paths.
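
One way to picture an access path that does not depend on order is to key the template on the unordered set of choices, as in this sketch. The registry, the template name, and `find_template` are hypothetical; this is not how FTM or any other application is known to work.

```python
from itertools import permutations

# Hypothetical template registry keyed by an unordered set of choices.
TEMPLATES = {
    frozenset({"Census", "Digital", "England"}): "England census, digital images",
}


def find_template(*choices: str) -> str:
    """Look up a template regardless of the order the choices were made in."""
    return TEMPLATES[frozenset(choices)]


# Every order in which the three choices can be made.
paths = list(permutations(["Census", "Digital", "England"]))
results = {find_template(*p) for p in paths}
```

All of the orderings in `paths` land on the same single entry in `results`, which is the point: the path is only a route, not part of the template's identity.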
GeneJ 2011-08-03T17:18:25-07:00
Geir's document touches on these points in several ways, and probably says it better than I will ...

His document recognizes that genealogical researchers are part of a larger global community that includes archivists, librarians, etc., and others who conduct research in different disciplines. Just as we want to establish genealogical technical standards "about sources" in an increasingly digitized day, other professionals are working to do the same thing.

As I see it, Geir's document lays the groundwork for us to recognize this larger standardizing community, so that we both participate in its development and gain from work already advanced.

In an oversimplified way, there are two or three main standardization concepts.

Metadata (or archival metadata).
Here's a link to the Wikipedia entry "Metadata Standards." See also a Library of Congress article, 2011, about MARC, and a little personal sketch about Jerry Simmons, an archival specialist at our National Archives (?2010).


For a host of sources that exist in a high or higher tech environment, archivists, librarians, etc. will have already begun identifying core metadata (the key information about a source). So, in some cases, when the user sets out to add a source, some information is already available. Today, for example, I routinely access metadata about published books from WorldCat.org--because librarians have been developing this technology for years. In the WorldCat.org example, the source has been previously declared a "book" or a "periodical."

So the point is, "metadata" exists for some sources and that base is growing.

Citation Styles
A second concept of standardization has to do with citation styles. Even if the world agreed on a set of data that was relevant for a given document, it's been said over and over again on the wiki that a standard shouldn't be tied to one style or another, but be broad enough to accommodate a variety of styles and customs.

To make that kind of transfer work, we need a good group of citation elements that could become available in all software and a better understanding (user to user) about what the citation elements mean/how they are used.

Once we have a handle on the citation elements (which I see as our version of the "metadata" fields), it's a matter of associating that data with styles. I see that effort as something we provide for, get started and then facilitate.
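
As a rough sketch of "associating that data with styles": the same set of citation elements can be fed to more than one output pattern. The element names and both style patterns below are made up for illustration and are not real published citation styles.

```python
# Hypothetical citation elements for one source; the names are illustrative.
elements = {
    "author": "Mills, Elizabeth Shown",
    "title": "Evidence Explained",
    "year": "2007",
    "page": "45",
}

# Each (made-up) style is just a format string over the same element names.
STYLES = {
    "full_note": "{author}, {title} ({year}), p. {page}.",
    "short_note": "{title}, {page}.",
}


def render(elements: dict, style: str) -> str:
    """Render one set of citation elements in the requested style."""
    return STYLES[style].format(**elements)
```

The data is entered once; swapping `"full_note"` for `"short_note"` changes only the presentation.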

Bringing it all together ...
Only a subset of all sources will ever be coded with metadata by professionals. Where that data is available, it will often not equal precisely the information needed for a citation (it may contain more data and also not include identifying details some folks want to cite, especially in a full reference note).

As BetterGEDCOM, we would work to define elements that are associated with particular kinds of sources (Zotero calls them "item types"). I've spent a couple of days on this ... sort of thinking we might have one subgrouping level per item type.

Geir's proposal then talks about "modules." As I envision this, examples of core modules are repositories/Repositories/Privately held items/Image/web access, etc.

Separately, then we'd want to deal with the issues specific to genealogical software. For example, that larger standardizing community is unlikely to be interested in whether or not we have "citation mechanics" or users who tend to lump or split. (But then, paper based genealogists may not take much interest in those mechanics, either, as you have pointed out.)

Make sense? --GJ
AdrianB38 2011-08-04T06:59:16-07:00
Gene - what you say is fine, but, unless I'm missing something, doesn't address the point I was making in my last post.

That point was that worrying about Source Group categories as seen in the FTM screen shot was not a profitable thing to do. Whether you call them Census / Digital / England or Digital / Census / England or any of the 6 perms, doesn't actually matter so long as you end up with the same template.

Similarly there's no need to worry whether cemetery records should be classed as business records, church records, or their own category of cemetery records - stick the same templates in all 3 higher level groups so anyone can find them. It's only an access path and most places can be reached by several routes.
GeneJ 2011-08-04T10:26:23-07:00
Sorry ...

The objective is twofold. First, it should communicate what might be called the originating template. Second, you should just as easily be able to apply some other style to that same data.

If I'm transferring from self to self, I may want to stick to my presentations. If I'm transferring to the UK or Norway, that user might wish for other conventions.
gthorud 2011-08-04T14:24:10-07:00
The following is about what I think is discussed in the beginning of this discussion. I prefer to use the term “source” rather than “record” since it is well established Gedcom terminology.

Some of the source types listed in the first posting above are probably independent of country (e.g. book), but in general I think you will find many types that do not exactly match a similar source in another country, incl. source types that only exist in one country. I think it will be very difficult to harmonize the majority of types across countries. My guess is that a good first criterion for selecting a source type is “Country” (probably the program will let the user select a default). If you try to use class/type (I have used class in the Architecture document), e.g. census, as the first selection criterion, you will end up with a long list to select from if you have all possible record types from several countries. If it turns out that there are many sources that have the same definition and citation elements internationally, you can define a country “international” and set the default to “My country + international”.
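
A minimal sketch of country-first selection with a "My country + international" default might look like the following. The list of source types and the `candidates` function are hypothetical sample data and naming, not taken from the Architecture document.

```python
# Hypothetical source types, each tagged with a country (or "international").
SOURCE_TYPES = [
    ("Book", "international"),
    ("Census 1861", "England"),
    ("Census 1880", "United States"),
    ("Census 1865", "Norway"),
]


def candidates(my_country: str, include_international: bool = True):
    """Return the source types offered once the user's default country is applied."""
    wanted = {my_country}
    if include_international:
        wanted.add("international")
    return [name for name, country in SOURCE_TYPES if country in wanted]
```

Filtering on country first keeps the list short; filtering on "census" first would return every census from every country.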

One could in theory let a Source Type Set (cf. Architecture …) apply to several countries, but if you do not cover all sources in all these countries, you may get other sets that cover the same (or some of the) countries – that would not be a very tidy situation. I think users want some order, and I don’t see a big benefit in having a set covering several countries vs having several sets. And I don’t see the big benefit in matching a few classes across borders either.

A second issue is that various programs may have chosen different groupings of source types even within one country. So there may be different “decision trees” that you have to go through to select one record type depending on the program, all for the same set of source types. But, if I want my program to receive/download definitions of record types for “Far-away-stan”, that my program does not support, I also need info to direct selection among these source types. So there has to be a way to transfer the selection criteria. But, assuming that there are no dependencies among the criteria, it should not matter in what order the user goes through them – thus there could be several sets of selection criteria for a Source Type Set. But a program must be able to support any order, otherwise there is no use in being able to transfer selection criteria. If the program later implements support for these source types, it may design its own selection criteria for the source types, according to the program’s way of doing things.

But it is a question if the steps (criteria) have to be the same (but may be in a different order). In my document I am talking about “Citation Element Modules” that can be tied to selection criteria. If we see any benefits in the modules, it may be difficult to have criteria that are different – not matching the modules. But there are also criteria that do not result in modules, one of them I assume is “Source Type Class” (e.g. census), where it might perhaps be possible to have several ways of classification for a “Source Type Set”.

I think work on template modules will show that concatenating all selection criteria, with all possible variants, into the name of the source type – as in RM – will create very long names, and a VERY long list of types. I don’t think that is a feasible solution.

(What if a program implements a selection of sources (not source types) based on these criteria – rather than just selecting them by title? Then you might be in trouble if you import sources from another program. But such a selection method may not be smart.)

An observation: It may be more difficult for users to create a user defined source type if they have to supply selection criteria.

I assume that selection criteria are not needed to import citation data for an “unknown” source type. It will probably also be possible to import without knowing the definition of the source type – only the template – but I am not sure I have seen all aspects of this.
GeneJ 2011-08-04T15:23:40-07:00
I like the module concept very much, and assume these will reduce the length of selection lists.
GeneJ 2011-07-24T10:53:51-07:00
Principles ... Titles
Because I know this is everyone's favorite.

CMOS Online, 14.15. "Titles of larger works (e.g., books and journals) are italicized; titles of smaller works (e.g., chapters, articles) or unpublished works are presented in roman and enclosed in quotation marks (see 8.161). Such terms as editor/edited by, translator/translated by, volume, and edition are abbreviated."

The simplified version:

Italics denote the title of a published work, say a book.

Quotes around the title denote the actual title of an unpublished work. Quotes are also used to denote the exact title of the chapter from a published work. (In both examples, we're using quotes because, err... we are "quoting" something.)

Generic titles (no italics, no quotations) appear without formatting. Generic titles appear frequently for photographs or otherwise untitled manuscripts.

Perhaps an interesting example. U.S. censuses were titled in sequence (First, Second, etc.). It's pretty common for users, rather than using published titles such as _ ... First Census of ..._, to refer to the census with a generic title, as below:

1800 U.S. census....

Mills' comments about U.S. census generic titles at EE (2007), p. 265-266.
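
The simplified title rule above could be sketched as a small function. The `kind` labels are hypothetical, and italics are shown with underscores only because that is how this page marks them; real software would apply its own formatting.

```python
def format_title(title: str, kind: str) -> str:
    """Apply the simplified title rule; the `kind` values are hypothetical labels."""
    if kind == "published":                 # e.g. a book: italics, shown here as _..._
        return f"_{title}_"
    if kind in ("unpublished", "chapter"):  # exact titles are quoted
        return f'"{title}"'
    if kind == "generic":                   # descriptive label: no formatting
        return title
    raise ValueError(f"unknown kind: {kind}")
```

For example, `format_title("1800 U.S. census", "generic")` returns the text unchanged, while a published book title comes back wrapped in underscores.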
GeneJ 2011-07-28T07:49:25-07:00
In addition to Call Number, BetterGEDCOM should have unique fields for international standard identification numbers/identifiers.

ISBN - International Standard Book Number

DOI - Digital Object Identifier

ISSN - International Standard Serial Number