In March nearly 7,000 people traveled to the Salt Palace Convention Center in Salt Lake City, Utah, to spend the weekend at RootsTech, a yearly technology-focused genealogy conference sponsored by FamilySearch and a few other big names in the family history industry. Genealogy — the search for and documentation of one’s ancestors — and “technology” haven’t always been kissing cousins, but this conference speaks to and encourages a growing relationship between the two. The hobby, traditionally picked up near retirement age and most often by women, is now a billion-dollar industry with a growing younger demographic.
In the past few years, finding and charting one’s family history has become trendy because it’s also become a lot easier to get started. Companies like Ancestry.com and FamilySearch have spent the last decade or so making all of their tools, records, and data available on the internet, revolutionizing genealogical research — and significantly lowering the barrier to entry in the process. What was once a pastime for older people or professionals with disposable income is quickly becoming a more mainstream pursuit. Taking a peek into the past now requires nothing more than a decent internet connection and a laptop. DNA testing, which just a few years ago cost thousands of dollars and offered little information for genealogists, is now a growing consumer option, reaching back hundreds of years to provide undreamed of amounts of information about our ancestors.
Genealogy’s next phase, which is quickly approaching, is actually its end game. The massive accumulation, digitization, and accessibility of data combined with recent advances in DNA testing mean the questions we have about our families — who they were, how they got here, and how they’re related to us — will soon be instantly solvable. Realistically, the pursuit of family history as it exists now probably won’t be around in 20 years: most of the mysteries are disappearing, and fast.
So, who are we? How did we get here? Where did we come from? And where are we going?
A tree grows in Provo
Ancestry.com’s headquarters are nestled at the foot of a mountain in Provo, Utah, just 45 miles from downtown Salt Lake City. Founded in 1990 by Paul B. Allen (not the Paul Allen of Microsoft) and Dan Taggart (both Mormon graduates of Brigham Young University), the company was initially known as Infobases, and distributed Latter-day Saints (LDS) publications on floppy disk. From its earliest days, Ancestry.com was a software company, selling disks of LDS archives for around $300 out of Allen’s car. By 1995, the two were putting their wares online. Ancestry.com quickly became a leading destination for online genealogical research. Though the tools to create online family trees, indexes, and records were free, actual scanned images of historic documents were behind a paywall, and the company made much of its money through subscriptions. Today, an all-access membership to the service costs $298 a year — around $35 a month — for its over 2 million paying subscribers.
Years ago, the information would have filled a shelf of handwritten binders
The company’s extensive records mean anyone can construct a family tree. Once you find a record — say a 1940 census image — that you believe has your grandfather's name on it, you can link that record to his name in the tree. Your tree can be private, or you can share it and link your tree to others’. It's a powerful, centralized place where almost all of your research can be consolidated. Years ago, the same amount of information would have filled a shelf of handwritten and photocopied paper binders and wouldn’t have been easily shared.
Ancestry.com is the most recognizable brand in genealogy today
Tim Sullivan, Ancestry.com’s CEO, gave a keynote address at RootsTech. He has been with the company for 10 years. Before that he had been first the COO, then the president of Match.com. He’s also worked for Ticketmaster and Disney. Under Sullivan's stewardship, Ancestry.com has become the most recognizable — and probably also the most successful — brand in genealogy today. Still, he is approachable in a way that most CEOs are not and, as we walk through the halls of the convention center looking for a quiet place to talk, people wave, smile, and occasionally approach him. "Search, that’s what the past five years have been about," he says. But now, he says, "Family history is truly social." People work together — whether they know it or not — to improve their own personal trees, and to improve Ancestry.com’s data, by "stitching together" their information.
Sullivan is right that the company’s early success hinged almost entirely upon search. Ancestry.com provides an unmatched and ever-improving search algorithm. A generic search engine such as Google can’t distinguish between, say, a first and last name, which can mean all the difference in this kind of work, especially if your ancestor’s first name was something common like "Smith" or "Taylor." But Ancestry.com (and other companies like it) has built a search engine with a specific, single-minded purpose. It can handle, in one request: a first name associated with a last name (including a vast array of alternate spellings); a range of dates; a specific or broad range of documents to search; a geographic location as broad as a country or as specific as a town; a number of birth dates; a birth location; and additional names such as those of a relative’s children. The engine — which processes around 45 million searches a day (Google sees around three billion) — isn't perfect, but it is very powerful, and it’s constantly being tweaked and upgraded. The search can return hundreds of results within these specifications, ranked according to how well they match. By mining its massive database of archives and connections, Ancestry.com also automatically delivers "hints" — denoted by a shaking, illustrated leaf — based on your family tree that point to potential relatives and primary sources. Recently, it also debuted a Facebook sharing feature, where you can link yourself and your family’s Facebook accounts to profiles in your tree, too. This has also increased the power of Ancestry.com’s search.
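To make the difference from a generic keyword search concrete, here is a minimal sketch of field-aware matching and ranking. It is not Ancestry.com's algorithm, which the article does not describe in detail. The `Record` and `Query` types, the tiny `SURNAME_VARIANTS` table, and the point weights are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical spelling-variant table; a real engine would use far larger lists.
SURNAME_VARIANTS = {
    "smith": {"smith", "smyth", "smythe", "schmidt"},
    "taylor": {"taylor", "tayler", "tailor"},
}


@dataclass
class Record:
    first_name: str
    last_name: str
    birth_year: Optional[int]
    place: str  # e.g. "Boston, Massachusetts, USA"


@dataclass
class Query:
    first_name: str
    last_name: str
    birth_year_range: Tuple[int, int]
    place: str


def surname_matches(query_name: str, record_name: str) -> bool:
    """True if the record's surname is the queried one or a listed variant."""
    q, r = query_name.lower(), record_name.lower()
    return r in SURNAME_VARIANTS.get(q, {q})


def score(query: Query, record: Record) -> int:
    """Add up points per matching field (the weights are arbitrary)."""
    points = 0
    if surname_matches(query.last_name, record.last_name):
        points += 3
    if query.first_name.lower() == record.first_name.lower():
        points += 2
    low, high = query.birth_year_range
    if record.birth_year is not None and low <= record.birth_year <= high:
        points += 2
    if query.place.lower() in record.place.lower():
        points += 1
    return points


def search(query: Query, records: List[Record]) -> List[Record]:
    """Return records with a non-zero score, best match first."""
    scored = [(score(query, r), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in hits]
```

Because first and last names are separate fields, a query for first name "Taylor" and last name "Smith" ranks a Taylor Smith above a Smith Taylor, which a plain keyword match could not do.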
All God’s children
Interest in one’s ancestors dates back as far as history itself, but for most of mankind’s time on earth, the study of kinship and descent was reserved for royalty and the super-rich, usually with the aim of consolidating power and wealth. Modern hobbyist genealogy as it is practiced today, however, has its roots in the founding of the New England Historic Genealogical Society in Boston, Massachusetts in 1845. The Society popularized the system of charting relationships on "trees" as originally developed by John Farmer in the 1820s and still practiced today. In the next decade, a similar society was founded in New York, and it became common to seek ties to the Founding Fathers and other Revolutionary War figures. "When the Daughters of the American Revolution started up, and The Mayflower Society," Thomas MacEntee, a genealogist well known throughout the hobby’s large and growing online community, says, "that's really what I call the first phase of genealogy." As the American Republic was born, so was genealogy in the US.
Less than 50 years after the founding of the New England Historic Genealogical Society, what would arguably become the most important player in the genealogy game came into existence, way out west in Salt Lake City, Utah.
Thomas MacEntee says that Salt Lake City is the "Mecca" of family history research. This is because it is also the home of The Church of Jesus Christ of Latter-day Saints and its Family History Library, founded in 1894 as the Genealogical Society of Utah. It is the largest library in the world dedicated to genealogy. Its online portal, FamilySearch, sees about 10 million visits a day.
One of the Mormons’ fundamental tenets is doing genealogy
Salt Lake City was founded by Brigham Young and several other Latter-day Saints in 1847, and now has a population of just over one million, about half of whom are members of the LDS Church. Donald Anderson, the senior vice president of patron and partner services at the Family History Library, says that the Mormon church believes in "eternal families," and in the ability of those families to "continue beyond this life." So identifying ancestors is, he says, a "significant part of the doctrine of the church." Standing between giant banks of filed microfilms, he says, "We’re all God’s children."
One of the Church's fundamental tenets is doing genealogical research because its members believe that Mormons can baptize ancestors in their absence. The act of baptizing family by proxy — i.e., without the knowledge or permission of the ancestor, usually because they're deceased — has been fairly controversial, but it’s not a focus for most genealogists. FamilySearch and The Family History Library’s staff welcome Mormons and non-Mormons alike. That’s because the library’s usefulness reaches far beyond its own religious goals, and the Latter-day Saints believe in spreading their information far and wide, all free of charge.
The Family History Library is an angular, jarringly modern building, open to the public six days a week. All of its services are free. Many — if not most — of its half-million yearly visitors are hobbyist or professional genealogists with no Mormon affiliation. They come because the Family History Library has amassed the largest collection of documents, books, and microfilms related to genealogical research in the world.
Beginning in the late 1930s, the Latter-day Saints undertook a massive project, finding and microfilming genealogical records on a global scale. Using an army of volunteers and missionaries, the LDS visited governments and churches (where most vital records were kept until the turn of the 20th century) all over the world, amassing 2.4 million rolls of microfilm. The Family History Library also operates 4,600 volunteer-staffed Family History Centers worldwide; these are smaller research facilities where patrons can order microfilms and books from the main library and have them delivered for off-site work close to home. But the days of digging in dusty libraries (the Family History Library is state-of-the-art, and not, in fact, dusty at all) for long-forgotten and yellowed documents are quickly coming to an end, thanks to the internet.
So you want to make a family tree... now what?
"It seems like the internet was invented for genealogy."
Until the internet came along, researching your family was a grueling and often unrewarding process. If, like many people, you were starting from scratch — maybe you know your four grandparents’ names and not much else — beginning could be nearly impossible. Thomas MacEntee got his start in the ’70s and says, "You had to go down to an archive or repository" straight away. Luckily for him, he went to college in Washington, DC, home of The National Archives, which house an immense collection of United States census and military records. "It was all very paper-based," he says. The records were either paper or microfilm, and access required travel or, failing that, a mail order. The records typically weren’t indexed, either, so you had to know exactly what you wanted: a tall order if you were looking for a great-grandparent’s death certificate without knowing what day they died. All the charting of the family tree, of course, was paper-based too, so it was also often hard to figure out the relationship between, say, one cousin and another. Until the mid-2000s, almost none of this information was readily available online. Today almost all of it is, with the exception of some vital records (state laws determine their availability), and many military service files.
Katie’s Family Tree
Katie Notopoulos is just the kind of genealogist that these advances in technology have made possible. A self-described "hobbyist genealogist" and editor at BuzzFeed, Katie says she got started about five years ago when a friend told her how interesting and fun it was. She doesn’t go to genealogy conventions or scour cemeteries for missing dates on tombstones. "I do all of my research online," she says. This simply wasn’t possible until a few years ago.
To construct a family tree, one had to be a historian, a detective, and a linguist
"Genealogy is an industry that I think has lagged behind technology," MacEntee says, likely because the record holders — small churches and local governments — didn’t have the funds to microfilm and exhaustively catalog (and later, digitize) their records. Often small county courthouses have just one person dedicated to processing requests for family history records. Progress in digital photography, scanning software, and OCR (optical character recognition) technologies have only recently brought down costs.
In the early days, constructing a tree could be a lonely road, often with just an overworked librarian or archivist here or there to help you make sense of your findings. In order to be successful, you've also historically had to possess a broad, working knowledge of geography, history, world events, and immigration patterns over the last 200 years. A lot of that has changed.
Katie has constructed, for the most part, a tree reaching back about five generations, which includes British, Greek, and German ancestry. She has mostly accomplished this using Ancestry.com. "Very early on," she says, "I had a breakthrough where I found someone else who had been doing research for years and years on a branch of my family." Finding another person — however distantly related to her — to work on the same project with, she says, "was immensely helpful and made it a lot easier to quickly trace far back, which all seemed very romantic and exciting." She likes to do her research in tiny chunks of time, sitting on her couch, watching TV. "What makes it a good hobby," she says, is that she can solve "mini-mysteries." It's an activity that takes her away from the "real world, temporarily." She even found a third cousin who was researching one of her family lines on Ancestry.com, and met them for dinner when they visited New York. This is a regular phenomenon in the online world of ancestral research.
"We’re all related" is a sentiment you’ll often hear
"We're all related" is a sentiment you hear often in the genealogy community, and it's not completely untrue: go back just 10 to 15 generations and many of us will find common links. But not until the internet was that widely held wisdom actionable in any useful and organized way. "It seems like the internet was invented for genealogists," Thomas MacEntee says. He's at his birthday party in a ballroom at the Radisson in Salt Lake City, and it's the third night of RootsTech. It's a large, friendly gathering, and there are even some celebrities of the genealogy world in attendance: Cyndi Howells, who invented Cyndi's List, is there, along with people who work for FamilySearch and the Israeli company MyHeritage. Thomas knows everyone, and the community is tight-knit, meeting up a few times a year at conferences.
The rest of the time, though, they're online, networking and helping one another dig. At its core, genealogy's draw is in the hunt, in searching — sometimes for years — for a clue that holds the key to another ancestor. The search is, of course, essentially infinite: most people are lucky to get back five or six generations, at which time their tree will contain upward of 5,000 people.
In the earliest days of the internet, the best place for genealogists to meet up was on email lists and message boards, where they pooled resources and helped each other look up newspaper clippings or birth records. In 1984 the Latter-day Saints published the open standard format for a genealogical file, known as the GEDCOM. A simple plain text file with metadata linking records to one another, GEDCOMs can be read by different types of proprietary software and remain the standard file format even today. This meant that people could begin to share large amounts of information — their findings, their families — with one another in an easy and portable way online. Small websites focusing on single families or on compiling lists of obituaries in one small town popped up all over the internet. Some people walked entire cemeteries, documenting every headstone, painstakingly transcribing them and putting them online. What was an essentially data-driven hobby wasn’t going to lag behind forever. "We knew it would catch up, eventually," Thomas says. And it did. By the mid-90s, small startups began to see that the internet could mean big business for genealogy, and the Latter-day Saints were taking notice, too.
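For a sense of how plain the format is, here is a minimal sketch of reading a GEDCOM file in Python. Each line carries a level number, an optional cross-reference ID such as `@I1@`, a tag, and an optional value; surnames sit between slashes. The sample below is a trimmed, made-up fragment rather than a complete valid file, and `parse_gedcom` is a hypothetical helper that handles only the handful of tags shown.

```python
SAMPLE_GEDCOM = """\
0 HEAD
1 GEDC
2 VERS 5.5.1
0 @I1@ INDI
1 NAME John /Smith/
1 FAMS @F1@
0 @I2@ INDI
1 NAME Mary /Taylor/
1 FAMS @F1@
0 @I3@ INDI
1 NAME Ann /Smith/
1 FAMC @F1@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I3@
0 TRLR
"""


def parse_gedcom(text):
    """Collect individuals' names and each family's members from GEDCOM text."""
    individuals = {}  # cross-reference ID -> display name
    families = {}     # cross-reference ID -> {"HUSB", "WIFE", "CHIL"}
    current_id = None
    current_type = None

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(" ", 2)
        level = int(parts[0])

        if level == 0:
            # Either "0 @I1@ INDI" (a record) or "0 HEAD" / "0 TRLR" (no ID).
            if len(parts) == 3 and parts[1].startswith("@"):
                current_id, current_type = parts[1], parts[2]
                if current_type == "INDI":
                    individuals[current_id] = ""
                elif current_type == "FAM":
                    families[current_id] = {"HUSB": None, "WIFE": None, "CHIL": []}
            else:
                current_id = current_type = None
            continue

        # This sketch only reads level-1 lines inside INDI and FAM records.
        if level != 1 or current_id is None:
            continue
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""

        if current_type == "INDI" and tag == "NAME":
            individuals[current_id] = value.replace("/", "").strip()
        elif current_type == "FAM" and tag in ("HUSB", "WIFE"):
            families[current_id][tag] = value
        elif current_type == "FAM" and tag == "CHIL":
            families[current_id]["CHIL"].append(value)

    return individuals, families


people, families = parse_gedcom(SAMPLE_GEDCOM)
for fam in families.values():
    parents = [people[fam[role]] for role in ("HUSB", "WIFE") if fam[role]]
    children = [people[c] for c in fam["CHIL"]]
    print(" and ".join(parents), "->", ", ".join(children))
```

The cross-reference IDs are what link records to one another: a family record points at its members, and each individual points back at the families they belong to.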
The search continues
Search is the key. Scott Sorensen, the CTO of Ancestry.com, explains that each search result is tied to a series of metadata — an index, or basic information such as the person's name. It’s also usually tied to a high-quality scan of a document which may be hundreds of years old. "We've got 10 billion records, four petabytes of data," he says, tied to the results returned from a search. Any search might dig up between 10 and several hundred results, weighted according to how well they match your terms. And with each search, the engine improves: "All of the interactions that our customers have with our site, we're able to learn from those." Using "machine learning technology, we can observe customer behaviors in the aggregate and over time learn and improve our algorithms, because they continue to add structure to our data," he says. The Ancestry.com users, he says, "continue to make judgements about the records, which we're able to learn from." Finally, the indexes and records are tied, through customer interaction, to 38 million separate user-generated trees, which can further be linked to one another in an ever-expanding giant matrix of data representing people’s families.
So where are the records coming from? Unsurprisingly, many of them come from the Latter-day Saints' Family History Library. Ancestry.com forges agreements with companies large and small, granting access to the valuable records behind its paywall. The library’s data is particularly useful since the LDS "got there first," in many cases — for instance, by microfilming census data. FamilySearch CEO Dennis Brimhall says that because the organization is a nonprofit (as an arm of the church), it’s easy for it to share records. "We’re just interested in people finding records," he says. "We hope that works with their financial model. It probably works with ours because we don’t really have a financial model, but what we really want to do is make more records available to more people throughout the world." This thinking drives most of the companies in the genealogy business: access is key, regardless of who owns what, so companies share their data rather than force each other to "duplicate efforts" by digitizing redundant copies. Ancestry.com also has relationships with the non-profit JewishGen, the largest destination for Jewish genealogy, and Find A Grave, the most comprehensive user-generated database of cemetery transcriptions in the world. Some of these relationships give users direct access to the records and data without ever leaving Ancestry.com’s portal; some, such as the indexes of vital records and censuses from the United Kingdom, allow users to see names and other basic information. If access to the actual images is required, however, users have to go to the site that controls them directly.
But Ancestry.com is also actively buying. "We spend over $20 million a year acquiring new content," Scott Sorensen says. On the day of our visit to a clean laboratory, employees are using digital cameras and proprietary software to create high-resolution scans of high school yearbooks. "Yearbooks are incredibly important for genealogy," Thomas MacEntee says, "because they are a great source of finding maiden names for women," a tricky problem when married women often take their husband’s surname. Ancestry.com has acquired a huge number of similar "secondary" sources such as city directories, phone books, and church directories. Once the images are scanned, names are transcribed, metadata is embedded, and the images are uploaded with indexes to Ancestry.com’s website. The company has also bought several other genealogy and archive businesses — smaller competitors — in order to bolster its resources. In April of 2012 it paid $100 million for Archives.com, and that October acquired the photo digitization service 1000memories.
FamilySearch has a website with very similar capabilities, where everything is free. The search isn’t quite as powerful as Ancestry.com’s, and its family tree-making software is not as robust, but the massive collection is growing literally by the day. This growth is fueled by over 150,000 volunteer transcriptionists using a proprietary Java application the company developed itself. Anyone at home can download the app, and in a few minutes transcribe a series of birth, death, or marriage records. This process, called "Indexing," is one of FamilySearch’s most prized and valuable tools. Each year, using its sophisticated transcribing and indexing system, FamilySearch adds 400 million indexed, organized images to its website. The company — which used to distribute its records via microfilm and CD-ROMs — can now move incredibly fast to make its data available to genealogists. The process, from capturing images in the field to making the records available to customers, used to take about 18 months. Now it’s usually less than two, and of course — it’s online, not on rolls of microfilm.
Family history is big business
Ancestry.com and FamilySearch may be the biggest names in online genealogy, but they’re far from the only ones, and the newer players are moving fast to try to eat up a piece of the growing market. MyHeritage is an Israeli company founded in 2003 whose service operates more like a social network for family members — both the living and the dead — than traditional family tree approaches. The site recently raised $25 million in funding, and is available in 38 languages. Because its early focus was on places such as eastern Europe — where Ancestry.com’s holdings are somewhat weaker — MyHeritage arguably offers something unique to American audiences as it makes an aggressive push into the US market. UK-based FindMyPast.com is also making headway in the American market because exclusive relationships with the governments of England, Scotland, and Wales have essentially given it a monopoly on vital records in those countries. FindMyPast.com’s CEO Chris van der Kuyl is also the president of 4J Studios, the company responsible for bringing Oblivion to the PlayStation 3 and Minecraft to the Xbox. He describes himself as a "technology geek" and thinks about genealogy from that perspective. He got into the family history business by accident when a friend asked him to apply some of his user-experience building skills to a genealogy company’s software. Five years later, he’s still here, working at the helm of the UK’s most powerful family history source. "Technology is empowering," he says, "and the more people [who] have access to the right technology and bring their own data and their own experiences, the more exponentially things get better for everyone. Our mission is to create the most amazing family history experience, to give as many people access to their own story as possible."
By 2010, Ancestry.com had forged a relationship with NBC to bring the UK television series Who Do You Think You Are? to mainstream US audiences. The show featured professional genealogists paired with celebrities like Sarah Jessica Parker, Steve Buscemi, and Spike Lee, sending them on journeys to find the stories of their ancestors. In October of 2012, Ancestry.com — then a publicly traded company — was acquired by a group of investors including CEO Tim Sullivan and European private equity firm Permira Advisers LLP for $1.6 billion. Family history is big business, sure, but searching documents online isn’t the only way to figure out who you are. If you want to get serious and look to the future, well then you’re going to have to spit.
Spitting image: DNA can solve this for you
In Mountain View, California, just around the corner from Google, rest the unassuming headquarters of 23andMe. The company was founded in April of 2006 by a small group including biologist Anne Wojcicki, who is now married to Google cofounder Sergey Brin. 23andMe was created as a personal genome company, its main goal being to put the "power of a person’s health into their own hands," says Catherine Afarian, the company’s public relations manager. While that sounds like a simple mission, it was unheard of just a few years earlier.
Katie recently took the 23andMe DNA test, as well as an AncestryDNA test. She did so, she says, because she was curious to see how well science matched up with what she had found in her research. She simply signed up for an account on the website, ordered the DNA test, spit into a tube once the test had been shipped to her, and then registered its barcode number on the website. About eight weeks later, personalized ancestry and health results showed up in her inbox.
At launch, the test cost $999. It was fairly cheap, all things considered, but not something that everyone could afford. This past December, after announcing it had amassed a database of 180,000 DNA tests, 23andMe permanently lowered the cost of the test to $99 in the wake of a large round of financing, and set its sights on getting 1 million tests in its database this year. Though just two percent of Americans have taken such a test, a study conducted by 23andMe indicates that nearly 71 percent of those who haven’t taken one are interested in doing so.
How DNA is inherited
Both Ancestry.com and 23andMe’s genealogical DNA results have similar features. Once your results have been processed — both companies send their tests out to a lab for extraction, then do in-house analysis — you can log into your account and see an approximate composition of your ancestral DNA, which dates back around 500 years. For example, if your grandparents were half Polish and half Irish, your DNA results wouldn’t necessarily reflect that closely, but they would show you roughly where your family came from 10 generations ago. The results for both tests are displayed in a map format (as seen in the diagrams above). "It’s a little bit confusing," Katie says, "because the Ancestry.com test shows that I have about 17 percent Scandinavian DNA, and I haven’t found any Scandinavians in my own research." This opens new, often previously unconsidered, territory for a genealogist to pursue.
Though 23andMe delivered some ancestry results at launch, its "Ancestry Composition" feature — which delivers fairly specific, sophisticated information based on 22 worldwide populations — was launched in August of 2012, just three months after Ancestry.com launched its new DNA testing feature. Both 23andMe and Ancestry.com now offer the same type of test: an autosomal DNA test which delivers specific ancestry information for anyone. Earlier tests for females tested only mtDNA, and delivered only ancestry results from their mothers: a much less specific and useful type of test. The release of a more powerful test by both companies, and the subsequent decrease in cost, means that many people are now signing up. Ancestry.com announced in March that its database comprises more than 120,000 DNA tests.
But how is the analysis done? Unsurprisingly, it’s complicated and, according to 23andMe’s senior director of research, "not very interesting." Basically, your DNA is tested using several hundred "markers," and compared using the "signal" those markers share strongly in common with geographic populations worldwide. Some markers have a very strong association with a specific location, making the results much more reliable, while others — such as those associated with central Europe, France, and Germany — are much less so, which often makes it difficult to draw that fine a distinction with a high level of accuracy. The process is further complicated by the fact that most people living today have multiple ancestries, as populations have inevitably migrated and mingled over the course of centuries.
So, if you take the spit test, your DNA is then compared to a set of "reference" tests — the DNA of thousands of people whose origins are well-documented and mapped to geographic locations. In the simplest of terms, where your DNA matches with those reference sets of data, a percentage of your ancestry can be extrapolated to be from that region, too.
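The "simplest of terms" version above can be written as a toy tally. This is only an illustration of matching markers against reference data and turning the matches into percentages. It is not how 23andMe or Ancestry.com actually compute results, since real analyses rely on statistical models. The marker names, variants, regions, and the `estimate_composition` function are all made up.

```python
from collections import Counter

# Made-up reference panel: for each marker, the region most strongly
# associated with each variant. Real reference data is far richer.
REFERENCE_PANEL = {
    "marker_1": {"A": "Region X", "G": "Region Y"},
    "marker_2": {"C": "Region X", "T": "Region Y"},
    "marker_3": {"A": "Region X", "C": "Region Z"},
    "marker_4": {"G": "Region Y", "T": "Region X"},
}


def estimate_composition(sample, panel):
    """Tally which reference region each of the sample's variants points to,
    then express the tallies as percentages of the matched markers."""
    tally = Counter()
    for marker, variant in sample.items():
        region = panel.get(marker, {}).get(variant)
        if region is not None:  # skip markers the panel knows nothing about
            tally[region] += 1
    total = sum(tally.values())
    if total == 0:
        return {}
    return {region: round(100 * count / total, 1) for region, count in tally.items()}


# A made-up test result: one variant per marker.
sample = {"marker_1": "A", "marker_2": "C", "marker_3": "A", "marker_4": "G"}
print(estimate_composition(sample, REFERENCE_PANEL))
```

Three of the four variants in this sample point to Region X and one to Region Y, so the toy estimate comes out at 75 percent and 25 percent.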
A secondary, and possibly more powerful, feature of both 23andMe’s and Ancestry.com’s DNA sites provides something else entirely through similar methods of comparison: it shows you other members who have taken the same test and are likely related to you. Both sites give a percentage of confidence in the match, so an example match might say, "We are 95 percent confident that member X is a fourth to sixth cousin." Now, a sixth cousin is pretty far back in your family tree, but a second or third cousin (and many people who take the test, Ancestry.com tells me, have one or two matches at that level of closeness) is not. A second cousin is the child of your parent’s first cousin, meaning that you and the other person share great-grandparents; a third cousin would mean that you share great-great-grandparents. On average, 23andMe says that each person who takes their test has more than 1,000 genetic matches found in the database. You have the option of contacting them — first anonymously — to compare information. Obviously, the more people who take the test, the more matches will be found, and the accuracy of those matches will increase too; hence the big push from both Ancestry.com and 23andMe to get many, many people to take their tests. It also helps explain the recent deep discount — in both cases permanent — to $99.
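Cousin terminology follows a simple rule that is easy to get wrong in conversation. By the standard convention, count the generations from each person up to their most recent common ancestor: first cousins share grandparents, second cousins share great-grandparents, and a difference in generations is expressed as "removed." The sketch below encodes that rule; `describe_cousins` and `ordinal` are hypothetical helper names, and the testing companies' own matching works from shared DNA, not from this arithmetic.

```python
def ordinal(n: int) -> str:
    """Format 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def describe_cousins(gen_a: int, gen_b: int) -> str:
    """Name the cousin relationship between two people.

    gen_a and gen_b are the number of generations from each person up to
    their most recent common ancestor (2 = grandparent, 3 = great-grandparent).
    """
    if gen_a < 2 or gen_b < 2:
        raise ValueError("cousins share a grandparent or a more distant ancestor")

    degree = min(gen_a, gen_b) - 1   # shared grandparents -> 1st cousins
    removed = abs(gen_a - gen_b)     # difference in generations

    label = f"{ordinal(degree)} cousin"
    if removed == 1:
        label += " once removed"
    elif removed == 2:
        label += " twice removed"
    elif removed > 2:
        label += f" {removed} times removed"
    return label


print(describe_cousins(3, 3))  # both are great-grandchildren of the shared ancestor
print(describe_cousins(2, 3))  # one is a grandchild, the other a great-grandchild
```

The second call describes the child of your first cousin, who by this rule is a first cousin once removed, not a second cousin.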
We’re approaching a future where the mysteries of our ancestral past will simply no longer exist
"We can create a whole new market of people who can make family history discoveries without having to do original research in old historical documents," Tim Sullivan says, calling the recent developments a "revolution in personal genomics." Ancestry.com links your DNA test directly into your tree, and 23andMe offers a less robust feature where you can upload a GEDCOM file to the site, also linking your data to a tree. Personal DNA tests for genealogy aren’t yet widespread, and they haven’t yet realized their full potential. But that’s not very far in the future.
So what does that mean for people who just love digging through documents, whether online or off, searching for tiny clues to link them to the past? In the shortest of short terms, the search continues. But realistically, in the next five to 10 years, it will become increasingly simple to find out who your ancestors were even several generations back, with relatively little effort: genealogy questions are a problem that technology is going to solve, and the foundation has already been laid. And a bit further in the future, it’s entirely realistic to believe that those questions of bloodline, like "Who was my great-grandmother?" simply won’t exist. The role that social networks like Facebook play in laying the groundwork for the future documentation of relationships is an important one — we’re all making more data than ever before. It’s not hard to imagine a future where the mysteries most of us have in our ancestral past will simply no longer exist.
The "stories," they say: that’s what all this data leads us to. A link to our past, not just in birth certificates, dates, and names on a chart, but through stories about who we are via who came before us. In the past decade, genealogy as a hobby has grown exponentially because of the vast amount of searchable data accumulated online: by companies like Ancestry.com, by the government, and by individuals. That trend will only accelerate in the coming years, making the research far more accessible for people with limited time or resources. "For me," Katie says, "I quickly moved past any sense of ‘these are my relatives’ and just fell in love with discovering these completely regular lives from the past, and learning history. I can't ever imagine thinking, ‘Well, I've found out everything I wanted to know; that's a wrap.’" Unfortunately for those who love the hunt, the future is about to get way less mysterious.
Video by Billy Disney, Stephen Greenwood, Ryan Manning, and Sam Thonis
Design by Scott Kellum
Infographics by James Chae and Scott Kellum
Photo credits: National Archives and Records Administration, and Katie Notopoulos