Since around 1980, the past has been becoming accessible through what is now the internet. Not the complete past, mind you, but it’s been getting there. Every project to digitize works of literature, art, philosophy, religion and anything else from the past serves to capture the pre-internet world in the digital present. And of course, a tremendous amount of information has been generated and stored as it was created throughout the digital age that began around 1980.

Through the addition of AI, the digitization of the past is now becoming a synthesis of the past. What’s the difference? When you query the past using conventional, non-AI technology, you get access to information that, hopefully, faithfully presents the same or similar facts from the same or similar sources. With current incarnations of AI-driven tools, the same query produces an answer synthesized from the digitized historical past, bounded by frameworks and protocols developed through repeated trial and error, training and reinforcement. All of the source data has been in one way or another processed, digested, compressed and abstracted in ways that keep relationships intact and help draw conclusions from the query, rather than just pointing someone to a data source.

What?

AI-driven tools need to create answers near-instantaneously to queries that are fundamentally natural-language questions. You are not looking for data or information; you are looking for an answer to a question, or a narrative on a topic if that is what is required. In other words, don’t point me in the right direction, just tell me what I need to know. When any question can be asked in any context, all information must be accessible for the AI to do its work. And with current technology, the source data needs to be pre-processed, pre-structured and summarized so that conclusions, relationships and inferences can be drawn from it. Something does get lost in the summarization and pre-processing. I wonder what it is?
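To make the contrast concrete, here is a toy sketch of the two modes of access. Everything in it is invented for illustration: the two-document corpus, the hand-written summaries and the keyword matching stand in for the real indexes, embeddings and language models a production system would use, but the lossy step is analogous.

```python
# Toy contrast between "point me to a source" and "answer from what was
# pre-processed". The corpus, the summaries and the keyword matching are
# all invented for illustration; real systems use embeddings, vector
# indexes and large language models, but the lossy step is analogous.

CORPUS = {
    "doc1": "Project Gutenberg began digitizing public-domain books in 1971 "
            "and hosts tens of thousands of titles contributed by volunteers.",
    "doc2": "Museum scanning programs capture paintings at high resolution, "
            "including brushwork details invisible in ordinary photographs.",
}

# Pre-processing: each source is compressed into a short abstract that keeps
# the relationships we care about but discards most of the detail.
SUMMARIES = {
    "doc1": "volunteer digitization project for public-domain books, since 1971",
    "doc2": "high-resolution scanning of paintings by museums",
}

def conventional_search(query: str) -> list[str]:
    """Return pointers to matching sources; the reader does the reading."""
    terms = set(query.lower().split())
    return [doc_id for doc_id, text in CORPUS.items()
            if terms & set(text.lower().split())]

def synthesized_answer(query: str) -> str:
    """Compose an answer from the compressed summaries, not the originals."""
    terms = set(query.lower().split())
    hits = [SUMMARIES[doc_id] for doc_id, text in CORPUS.items()
            if terms & set(text.lower().split())]
    return "From what was retained: " + "; ".join(hits)

print(conventional_search("digitizing books"))   # -> ['doc1']
print(synthesized_answer("digitizing books"))    # answer built from the abstract
# The brushwork detail in doc2 never made it into its summary, so no answer
# drawn from the summaries alone can recover it. That is what gets lost.
print("brushwork" in SUMMARIES["doc2"])          # -> False
```

The hand-written summaries here stand in for whatever compression a real pipeline applies; the mechanics change, but the question of what falls out of the abstract does not.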