Frederic Kaplan: How I built an information time machine

Imagine if you could surf Facebook ... from the Middle Ages. Maybe it is not as unlikely as it sounds. In a fun and interesting talk, researcher and engineer Frederic Kaplan shows us the Venice Time Machine, a project to digitize 80 kilometers of books in order to create a historical and geographical simulation of Venice across 1,000 years. (Filmed at TEDxCaFoscariU.)

The translation was done by Dimitri Frangoyanni and the editing by Chryssa Rapessi for TED Talks.

This is an image of the planet Earth. It looks very much like the well-known Apollo photographs. But there is something different: you can click on it, and if you click on it, you can zoom in on almost any place on Earth. For example, this is a panoramic view of the EPFL campus.

In many cases, you can also see how a building looks from a nearby street. This is awesome. But something is missing from this wonderful tour: time. I'm not sure when this photo was taken. I'm not even sure it was taken at the same time as the panoramic image. In my lab, we create tools to travel not only in space but also in time. The question we are asking is: is it possible to make something like a Google Maps of the past?

Can I add a slider on Google Maps and just change the year, seeing how it was 100 years ago, 1,000 years ago? Is that possible? Can I rebuild the social networks of the past? Can I make a Facebook of the Middle Ages? So, can I build time machines? Perhaps we can just say, "No, it is not possible." Or we can think about it from the point of view of information. This is what I call the information mushroom.
Vertically, you have time, and horizontally, you have the amount of digital information available. Obviously, for the last 10 years, we have a lot of information. And the further back we go in the past, the less information we have. If we want to make something like a Google Maps of the past, or a Facebook of the past, we have to enlarge this space; we have to make it rectangular. How do we do this? One way is digitization. There is a lot of material available: newspapers, printed books, thousands of printed books. I can digitize all of that. I can extract information from them. Of course, the further back we go, the less information we have. So it might not be enough. So I can do what historians do. I can extrapolate. This is what we call, in computer science, simulation. If I take a logbook, I can consider that it is not only the logbook of a Venetian captain making one particular journey. I can consider it a logbook that is representative of many journeys of that period.
I am extrapolating. If I have a painting of a facade, I can consider that it is not only that specific building, but that it probably shares the same grammar as buildings for which we have lost all information. So if we want to build a time machine, we need two things. We need very large archives, and we need excellent specialists. The Venice Time Machine, the project I'm going to talk to you about, is a joint project between EPFL and Ca'Foscari University of Venice.
There is something very peculiar about Venice: its administration has been extremely bureaucratic. They recorded everything, almost like Google today. At the Archivio di Stato, you have 80 kilometers of archives documenting every aspect of life in Venice for over 1,000 years. You have every boat that goes out, every boat that comes in. You have every change that was made in the city. It's all there. We are setting up a 10-year digitization program designed to transform this immense archive into a giant information system. The kind of goal we want to achieve is digitizing 450 books a day. Of course, digitization is not enough, because these documents are mostly written in Latin, in Tuscan, or in Venetian dialect, so we have to transcribe them, translate them in some cases, and index them, and that is obviously not easy.
In particular, the traditional optical character recognition methods that can be used on printed text do not work well on handwritten documents. So the solution is to take inspiration from another field: speech recognition. This is a task that may seem impossible but that can actually be done, simply by adding constraints. You need a very good model of the language used and a very good model of the document and how it is structured. And these are administrative documents; in many cases they are well structured. If you divide this huge archive into smaller subsets, where each subset shares similar features, then there is a chance of success.
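
As an editorial illustration of "adding constraints", here is a minimal sketch of one common idea borrowed from speech recognition: rescoring the recognizer's candidate readings with a language model. It is not the Venice Time Machine's actual method. The function names, the tiny English corpus, and the recognizer scores are all hypothetical, and a bigram model with add-one smoothing stands in for the "very good model of the language".

```python
import math
from collections import Counter

def train_bigram_model(corpus):
    """Count unigrams and bigrams in lines that are already transcribed."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        words = ["<s>"] + line.split()  # "<s>" marks the start of a line
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def language_score(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    score = 0.0
    for prev, cur in zip(words, words[1:]):
        score += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return score

def best_transcription(candidates, unigrams, bigrams, weight=1.0):
    """Pick the reading with the best recognizer score plus weighted language score."""
    return max(
        candidates,
        key=lambda c: c[1] + weight * language_score(c[0], unigrams, bigrams),
    )[0]

# Hypothetical toy data: a few known lines and two candidate readings,
# each paired with an invented recognizer log-score.
corpus = [
    "the ship left the port",
    "the ship entered the port",
    "the captain left the ship",
]
unigrams, bigrams = train_bigram_model(corpus)
candidates = [
    ("the shlp left the port", -1.0),  # the recognizer slightly prefers this reading
    ("the ship left the port", -1.2),
]
print(best_transcription(candidates, unigrams, bigrams))
```

The recognizer alone would choose the misreading, but the language model penalizes the unseen word strongly enough to reverse the decision. Restricting the corpus to a homogeneous subset of documents, as described above, is what would make such a model sharp.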
If we reach this stage, then there is something else: we can extract events from these documents. In fact, about 10 billion events can probably be extracted from this archive, and this giant information system can be searched in many ways. You can ask questions like, "Who lived in this palace in 1323?" "How much did a sea bream cost at the Rialto market in 1434?" "What was the salary of a glassmaker in Murano over the course of, say, a decade?" You can ask even bigger questions, because the information will be semantically encoded. And then you can put it in space, because a lot of this information is spatial. And from that, you can do things like reconstructing the extraordinary journey of this city, which managed to develop sustainably for over a thousand years, keeping a form of balance with its environment all that time. You can reconstruct this journey and visualize it in many different ways. But of course, you can't understand Venice by looking at the city alone. You have to place it in a larger European context. So the idea is also to document all the things that operated at the European level. We can also reconstruct the journey of the Venetian maritime empire: how it gradually controlled the Adriatic Sea, how it became the most powerful medieval empire of its time, controlling most of the sea routes from the east to the south. But you can do still other things, because in these sea routes there are repeating patterns. You can go a step further and create a simulation system, a Mediterranean simulator that is able to reconstruct even the missing information, which would allow us to ask questions as if we were using a route planner. "If I am in Corfu in June 1323 and want to go to Constantinople, where can I get a boat?" We can probably answer that question to within two or three days. "How much will it cost?" "What's the chance we'll encounter pirates?"
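
To make the idea of searchable, semantically encoded events concrete, here is a small editorial sketch. It assumes a very simple event record. The `Event` fields, the `query` helper, and every record below are hypothetical placeholders, not data or a schema from the project.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str     # hypothetical categories, e.g. "residence", "price", "salary"
    subject: str  # who or what the event is about
    place: str    # where it happened, so that it can be put on a map
    year: int     # when it happened
    value: str    # free-form detail extracted from the document

def query(events, kind, place, year):
    """Return the events of one kind recorded at one place in one year."""
    return [e for e in events if e.kind == kind and e.place == place and e.year == year]

# Placeholder records, invented purely for illustration.
events = [
    Event("residence", "Person A", "Palazzo X", 1323, "owner"),
    Event("residence", "Person B", "Palazzo X", 1400, "tenant"),
    Event("price", "sea bream", "Rialto market", 1434, "placeholder price"),
]

# "Who lived in this palace in 1323?"
residents = [e.subject for e in query(events, "residence", "Palazzo X", 1323)]
print(residents)
```

Because each record carries both a place and a year, the same structure can be filtered by time for a map slider or by place for the history of a single building.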

Of course, as you will understand, the central scientific challenge of such a project is characterizing, quantifying, and representing uncertainty and inconsistency at every step of the process. There are errors everywhere: errors in the document, the captain's name is wrong, some boats never actually went to sea. There are errors in translation, interpretive biases, and on top of that, if you add the algorithmic processes, you have errors in recognition and errors in extraction, so you have very uncertain data. So how can we detect and correct these inconsistencies? How can we represent this kind of uncertainty? It's hard.
One thing you can do is to document every step of the process, coding not only the historical information but also what we call the meta-historical information: how historical knowledge is constructed, documenting each step. This does not guarantee that we will converge on a single history of Venice, but we may be able to reconstruct a fully documented potential history of Venice. There may not be a single map. There may be several maps.
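
As an editorial sketch of what documenting each step could look like in data, the example below attaches a chain of processing steps to every claim and keeps competing claims side by side. All class names, step names, statements, and confidence values are hypothetical, and multiplying step confidences assumes the steps fail independently, which is a strong simplification.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str          # e.g. "transcription", "translation", "extraction"
    agent: str         # the person or algorithm that performed the step
    confidence: float  # estimated probability that the step introduced no error

@dataclass
class Claim:
    statement: str
    steps: list = field(default_factory=list)

    def record(self, name, agent, confidence):
        """Append one documented step and return the claim for chaining."""
        self.steps.append(Step(name, agent, confidence))
        return self

    def confidence(self):
        """Overall confidence, assuming the steps fail independently."""
        result = 1.0
        for step in self.steps:
            result *= step.confidence
        return result

# Two competing readings of the same hypothetical record; neither is discarded.
claim_a = Claim("The ship left in June").record("transcription", "algorithm", 0.9)
claim_a.record("extraction", "algorithm", 0.8)
claim_b = Claim("The ship left in July").record("transcription", "historian", 0.5)

ranked = sorted([claim_a, claim_b], key=lambda c: c.confidence(), reverse=True)
for claim in ranked:
    print(f"{claim.statement}: {claim.confidence():.2f}")
```

Keeping every claim together with its chain of steps, rather than only the winner, is one way a system could allow several documented maps instead of a single one.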
The system must allow for this, because we have to deal with a new form of uncertainty, one that is really new for this kind of giant database. And how should we communicate this new research to a large audience? Again, Venice is remarkable for that. With the millions of visitors who come every year, it is truly one of the best places to try to invent the museum of the future.
Imagine that horizontally you see the reconstructed map of a given year, and vertically you see the documents that fed the reconstruction, paintings for example. Imagine an immersive system that allows us to dive into and rebuild the Venice of a given year, an experience you can share within a group. Conversely, imagine starting from a document, a Venetian manuscript, and showing what you can construct from it, how it is decoded, and how the context of the document can be recreated. This is a picture from an exhibition currently being held in Geneva with this kind of system.

So in conclusion, we can say that research in the humanities is about to undergo an evolution probably similar to what happened in the life sciences 30 years ago. It's really a matter of scale. We're seeing projects that go well beyond what a single research team can do, and that's really new for the humanities, which are very often used to working in small teams or with just one or two researchers. When you visit the Archivio di Stato, you feel that it is beyond what a single team can achieve, and that it should be a joint effort. So what we need to do for this paradigm shift is to train a new generation of "digital humanists" who will be ready for this change. Thank you very much.

Written by giorgos
