Data Design Proposal

This is a consolidation of the “Second Design” posts into a single document. It’s the same text, just put together into a single page for convenience. If I do any editing, it will be to THIS document.

Posted by admin - June 29, 2014 at 2:21 pm

Categories: Data Design

Second Design 06

The User Experience

Our data design centers around these features:

  • How/where does this item fit into the Strong genealogy? Internally we discern this with the Dwight classification number.
  • From this item, we can click to the Langbehn Database, which allows us to explore the early Strong genealogy and family relationships.
  • From a given point in the Langbehn Database (or anywhere else, for that matter) we can bring up links to all related resources we have online with the SFAA.
  • We can create a master names list as an alternate way of exploring the above.

I don’t think we need to design in any advanced searching capability. That can come later as experience warrants. The Langbehn Database already has advanced search capability.

Posted by admin at 1:24 pm

Second Design 05

Other GEDCOM Database Files

If people submit genealogical information to us, they may be able to submit it in the form of GEDCOM files. Nearly all personal genealogy software supports data export to GEDCOM format.

Our TNG software can import any number of family trees (GEDCOM files), including attached images and documents if handled properly.

TNG’s advanced search capability can do searches covering a specific tree, or return results for all trees combined. Thus, rather than creating a huge “master database,” I believe it makes far more sense to maintain distinct trees as submitted. There are very strong reasons for not attempting to combine or merge information, but they are outside the scope of this document. TNG allows us to keep things intact as submitted, but do combined searches over all trees at once.
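To make the "distinct trees, combined search" idea concrete, here is a minimal sketch in Python. It is not how TNG works internally; it only illustrates the principle. The function names, the tree labels `submission_a` and `submission_b`, and the tiny GEDCOM fragments are all invented for illustration. The only GEDCOM detail relied on is that an individual's name appears on a level-1 `NAME` line with the surname between slashes.

```python
def names_in_gedcom(text):
    """Collect the values of level-1 NAME lines, with the surname slashes removed."""
    names = []
    for line in text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "1" and parts[1] == "NAME":
            # "John /Strong/" becomes "John Strong"
            names.append(" ".join(parts[2].replace("/", " ").split()))
    return names


def search(trees, term, tree_id=None):
    """Search one tree (when tree_id is given) or all trees.

    Returns (tree_id, name) pairs, so every hit still says which
    submission it came from. Nothing is merged.
    """
    selected = trees if tree_id is None else {tree_id: trees[tree_id]}
    term = term.lower()
    return [
        (tid, name)
        for tid, names in selected.items()
        for name in names
        if term in name.lower()
    ]


# Invented sample submissions, each kept as its own tree.
trees = {
    "submission_a": names_in_gedcom("0 @I1@ INDI\n1 NAME John /Strong/\n"),
    "submission_b": names_in_gedcom(
        "0 @I1@ INDI\n1 NAME Abigail /Strong/\n"
        "0 @I2@ INDI\n1 NAME Abigail /Brewer/\n"
    ),
}

print(search(trees, "strong"))                           # all trees combined
print(search(trees, "abigail", tree_id="submission_b"))  # one specific tree
```

Because each result carries its tree label, a combined search never blurs which submitter supplied which record.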

Posted by admin at 1:10 pm

Second Design 04

The Manuscripts in Progress

The SFAA Historians worked on updates to our published books from 1990 to 2004. I have most of this work in the form of Microsoft Word documents. These documents are organized the same way as the published books. That is, each separate Word document covers descendants of a specific person. Each can be tied to one specific Dwight number and person.

Because these are computer files, I believe that I can mine each document for names. I believe I can catch the name at the beginning of each formatted paragraph, and capture an excerpt such as the paragraph itself.

Since each of these documents was typed by hand, there is a lot of “fuzzy logic” involved in successfully capturing the names. Thus this information would be added to the database slowly, over time.
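As a rough illustration of the name-mining idea, here is a sketch in Python. It assumes the paragraphs have already been pulled out of the Word documents as plain text (that step is not shown). The paragraph format is a guess: an optional leading number, then a name made of capitalized words, ended by the first comma. The real manuscripts may well differ, and the sample paragraphs below are invented. Anything that does not fit the pattern returns `None`, so it can be set aside for a human to look at, which is where the "fuzzy logic" comes in.

```python
import re

# Assumed pattern (hypothetical): optional leading number such as "12." or
# "12)", then two or more capitalized words, ended by the first comma.
NAME_AT_START = re.compile(
    r"^\s*(?:\d+[.)]\s*)?"
    r"(?P<name>[A-Z][A-Za-z'.\-]*(?:\s+[A-Z][A-Za-z'.\-]*)+)\s*,"
)


def mine_paragraph(paragraph, excerpt_length=200):
    """Return the leading name and an excerpt, or None if no name is found."""
    match = NAME_AT_START.match(paragraph)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "excerpt": paragraph.strip()[:excerpt_length],
    }


# Invented sample paragraphs, not taken from the manuscripts.
paragraphs = [
    "Hannah Strong, sample text standing in for a manuscript paragraph.",
    "3. John Strong, Jr., more sample text.",
    "See the notes at the end of this chapter.",
]

for paragraph in paragraphs:
    print(mine_paragraph(paragraph))
```

A suffix such as "Jr." after the first comma is not captured as part of the name here; it stays in the excerpt. Handling cases like that one by one is the slow part of the work.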

Posted by admin at 12:44 pm

Second Design 03

The Printed Volumes

This project is not about the already-published books. However, there is a significant revenue opportunity that can be supported by a simple piece of the data design.

Our artifact is the Table of Contents of each book. Each Table of Contents is 1–3 pages long and is easily transcribed into, for example, a spreadsheet. Here is an excerpt from Volume Three:

THE JOHN STRONG, JR. LINE (each entry gives the Dwight #, the person, and the page number in this book):

  • #3, John Strong, Jr., Page 1
  • #39, Hannah Strong, Page 3
  • #41, John Strong, Page 12


  • #15, Abigail Strong, Page 199
  • #23954, Abigail Brewer, Page 199
  • #24042, Nathaniel Chauncey, III, Page 215

and so on.
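The entries in the excerpt above are regular enough to turn into spreadsheet rows automatically. The following Python sketch shows one way that might look. The column names (`volume`, `dwight_number`, `person`, `page`) are my own choice for illustration, not a settled design, and the pattern assumes every entry follows the "#number, person, Page n" shape shown in the excerpt.

```python
import csv
import io
import re

# "#<Dwight number>, <person>, Page <page>". The person may itself contain
# commas ("John Strong, Jr."), so the pattern anchors on the final ", Page n".
ENTRY = re.compile(r"^#(\d+),\s*(.+),\s*Page\s+(\d+)$")


def parse_toc_line(line):
    """Split one Table of Contents entry into its three fields."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    dwight, person, page = match.groups()
    return {"dwight_number": int(dwight), "person": person, "page": int(page)}


# Entries copied from the Volume Three excerpt.
lines = [
    "#3, John Strong, Jr., Page 1",
    "#39, Hannah Strong, Page 3",
    "#41, John Strong, Page 12",
    "#15, Abigail Strong, Page 199",
    "#23954, Abigail Brewer, Page 199",
    "#24042, Nathaniel Chauncey, III, Page 215",
]
rows = [dict(volume=3, **parse_toc_line(line)) for line in lines]

# Write the rows as CSV text, which any spreadsheet program can open.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["volume", "dwight_number", "person", "page"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Entries that do not match raise an error rather than being skipped silently, so a transcription slip shows up right away.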

Posted by admin at 12:25 pm

Second Design 02

The Langbehn Database, Again

Let’s look at The Langbehn Database from a data design perspective.

The Langbehn Database is represented in a collection of tables defined by TNG (The Next Generation of Genealogy Sitebuilding). The TNG product site is here: The software itself is well supported with an extremely active mailing list.

As part of our data design, I intend to add a link to each TNG-generated page showing our other resources related to that person. The information could also be part of a tooltip popup. The implementation details are outside this data design; I’ve done this sort of thing with TNG before.

What we DO need in this data design is provision for an SQL query that provides that information, given the relevant Dwight reference number.
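To show the shape of the query meant here, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `resources` table, its columns, and the manuscript row are hypothetical, invented for illustration; they are not TNG tables and not a final SFAA schema. The two printed-volume rows come from the Volume Three excerpt in the Printed Volumes post.

```python
import sqlite3


def related_resources(conn, dwight_number):
    """Return (kind, title, locator) rows for one Dwight number."""
    cur = conn.execute(
        "SELECT kind, title, locator FROM resources "
        "WHERE dwight_number = ? ORDER BY kind, title",
        (dwight_number,),
    )
    return cur.fetchall()


# Hypothetical table: one row per resource tied to a Dwight number.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources ("
    " dwight_number INTEGER NOT NULL,"
    " kind TEXT NOT NULL,"
    " title TEXT NOT NULL,"
    " locator TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_resources_dwight ON resources (dwight_number)")
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?, ?)",
    [
        (39, "printed volume", "Volume Three", "page 3"),
        (39, "manuscript", "Example manuscript", "example-file.doc"),  # invented row
        (41, "printed volume", "Volume Three", "page 12"),
    ],
)

for row in related_resources(conn, 39):
    print(row)
```

The point of the design is that the Dwight number is the only key the TNG page needs to hand over; everything else about our resources can change without touching the TNG side.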

Posted by admin at 12:10 pm

Second Design 01

There is a second area which might be a good place for formal Data Design. This area is not time critical, and is separate from the paper document solutions being discussed elsewhere. This design has the aim of revenue generation.

I will describe the specific artifacts which currently exist, and work towards the end user experience. Additional background material is at:

Posted by admin at 11:46 am

Complete Design Change: SFAA Perspective 02

I have been thinking in terms of a home-grown solution. I now find (via private correspondence) that a comprehensive paper-document-management solution may exist.

Where does that leave us? I’ll lay out some thoughts for discussion.

Posted by admin at 11:02 am

SFAA Perspective 01

The intention of this project is to make use of the Historian Archives. I’m taking what I think is a Business Intelligence approach: I’m trying to produce, from the mountain of information, results that are useful to people. The ideal would be to publish a dozen new volumes of genealogy, but that’s just not practical.

As the first step, we connect the Seven-Volume Index and the Langbehn Database. That is a great outcome and within current capabilities. It will take some time, but it’s feasible.

As a second step, we connect all other submitted GEDCOM files. This is a huge gain. Until now, I have not known what to do with submitted information, because I’ve been thinking in terms of creating a coherent book or some electronic equivalent.

Now, we’re thinking more in terms of a search engine. Browse around and make connections. I’m not sure where to take it from there, but that’s a great start.

Posted by admin - June 28, 2014 at 3:19 pm

Project Design 04

This project is intended to be of benefit to the SFAA, and it is intended to be feasible with available resources. It’s time, therefore, to see if I can articulate those benefits. I particularly need to be able to explain how we can make use of volunteer help.

So. How do we put this project in practical terms?

Posted by admin at 2:49 pm