Connecting Data, People and Ideas since 2016.
01 November 2021

AI + Knowledge - a match made in heaven?

*** Talk from KnowCon 2020 ***




What does graph have to do with machine learning?

A lot, actually. And it goes both ways.


Machine learning can help bootstrap and populate knowledge graphs.

The information contained in graphs can boost the efficiency of machine learning approaches.

Machine learning, and its deep learning subdomain, make a great match for graphs. Machine learning on graphs is still a nascent technology, but one which is full of promise.


Amazon, Alibaba, Apple, Facebook and Twitter are just some of the organizations using this in production, and advancing the state of the art.


More than 25% of the research published in top AI conferences is graph-related.


Domain knowledge can effectively help a deep learning system bootstrap its knowledge, by encoding primitives instead of forcing the model to learn these from scratch.


Machine learning can effectively help the semantic modeling process needed to construct knowledge graphs, and consequently populate them with information.


Key Topics

  • What can knowledge-based technologies do for Deep Learning?
  • What is Graph AI, how does it work, what can it do?
  • What's next? What are the roadblocks and opportunities?


Target Audience

  • Machine Learning Practitioners
  • Data Scientists
  • Data Modelers
  • CxOs
  • Investors



  • Explore the interplay between machine learning and knowledge based technologies
  • Answer questions that matter
    • How can those approaches complement one another, and what would that unlock?
    • What is the current state of the art, how and where is it used in the wild?
    • What are the next milestones / roadblocks?
    • Where are the opportunities for investment?


Session outline

  • Introduction
    • Meet and Greet
    • Setting the stage
  • Knowledge Graphs, meet Machine Learning
    • How can machine learning help create and populate knowledge graphs?
    • What kind of problems can we solve by using it?
    • Where is this used in production?
    • What is the current state of the art in knowledge graph bootstrapping and population?
    • What are the major roadblocks / goals, how could we address them, and what would that enable?
    • Who are some key players to keep an eye on?
  • Graph Machine Learning
    • What is special about Graph Machine Learning?
    • What kind of problems can we solve by using it?
    • Where is it used in production?
    • What is the current state of the art?
    • What are the major roadblocks / goals, how could we address them, and what would that enable?
    • Who are some key players to keep an eye on?



  • Extended panel
  • Expert discussion, coordinated by moderator
  • 2 hours running time
  • Running time includes modules of expert discussion, interspersed with modules of audience Q&A / interaction



  • Intermediate - Advanced


Prerequisite Knowledge

  • Basic understanding of Knowledge Graphs
  • Basic understanding of Machine Learning / Deep Learning


A talk by Isabelle Augenstein, Nathan Benaich, Amy Hodler, Katariina Kari, Fabio Petroni and Giuseppe Futia


You can listen to the podcast below:



If you are more of a visual type, you can also watch the presentation






[00:00:00.310]Welcome to the Connected Data London podcast, brought to you by the Connected Data London team. Connected Data London is the leading conference for those who use the relationships, meaning, and context in data to achieve great things. We have been connecting data, people and ideas since 2016. We focus on knowledge graphs, linked data and semantic technology, graph databases, AI and machine learning technology, use cases and educational resources. What does graph have to do with machine learning? A lot, actually, and it goes both ways. Machine learning can help bootstrap and populate knowledge graphs.


[00:00:41.350]The information contained in graphs can boost the efficiency of machine learning approaches. Machine learning and its deep learning subdomain make a great match for graphs. Machine learning on graphs is still a nascent technology, but one which is full of promise. Amazon, Alibaba, Apple, Facebook and Twitter are just some of the organizations using this in production and advancing the state of the art. More than 25% of the research published in top AI conferences is graph-related. Domain knowledge can effectively help a deep learning system bootstrap its knowledge by encoding primitives


[00:01:13.990]instead of forcing the model to learn these from scratch. Machine learning can effectively help the semantic modeling process needed to construct knowledge graphs and consequently populate them with information. This workshop by Isabelle Augenstein, Nathan Benaich, Amy Hodler, Katariina Kari, Fabio Petroni and Giuseppe Futia will discuss the issues above.


[00:01:34.450]Welcome everyone to Knowledge Connections 2020. This is the second session of the day, and on behalf of the entire organization team, which is actually two teams, because this is a joint effort between Connected Data London and the Knowledge Graph Conference, I would like to welcome you all to today's session, which should be a very interesting one, with a select group of people as our guests and panelists. Let me first start by introducing myself: I'm George. I'm part of the core organization team for Connected Data London and Knowledge Connections. As I mentioned, this is the second session of the day, so a little bit of logistics before we get to the more interesting part, and then I will leave the floor to our great team here.


[00:02:31.630]So if you are part of our audience, first of all, thanks a lot for making the time to be here. We would like to keep this as balanced as possible, meaning that we have a great group of people here and we expect to learn a lot from the conversation, but we also want to open the floor as much as possible to our audience. The way to do that without breaking the flow of the conversation is by using the chat box that you should be able to see on the bottom right part of the screen.


[00:03:04.150]So after this initial introduction, I will basically go away, and my job from that point on will be to monitor everything that's being typed in that chat. So feel free, by the way, to start a conversation amongst yourselves or type any kind of comment or question based on what you hear from the people in the panel, and at certain points in time I will make sure that your questions and comments are addressed. So without further ado, let me start by introducing the moderator for today's panel, Nathan Benaich, who is a man of many talents and a man who wears many hats.


[00:03:47.410]I came to know him primarily through one of his activities: Nathan is the author of the State of AI Report. That also applies to me, I also wear many hats, and one of those hats is that I write for ZDNet. For the second time in a row this year I covered this very interesting report, and I was quite fascinated, to be honest, both last year and this year, by the breadth and depth of the coverage in this report that Nathan produced with his business partner, Ian Hogarth.


[00:04:28.270]And well, let's just say that one thing led to another. Nathan, I understand, comes mostly to the topic of AI, from the machine learning side of things.


[00:04:40.210]Let's say.


[00:04:40.990]On the other hand, I come mostly from the knowledge representation and reasoning part. However, I found it very interesting that we found some middle ground, let's say, where those two disciplines came to meet, and that was the trigger for me to ask Nathan whether he would be interested in moderating a panel like this. Happily, he was. And again, one thing led to another, and here we are today. So, like I promised, I'll take a back seat. Nathan and everyone, the floor is yours.


[00:05:12.670]Have a great conversation.


[00:05:16.450]Thank you so much, George. Can everybody hear me?

[00:05:19.150]Okay, great.


[00:05:22.510]So yeah, as George mentioned, I'm very excited to be here. One of the activities I do every year is trying to get a comprehensive snapshot of everything that's going on in the machine learning world, from industry and research to talent and geopolitics. One of the trends that's really popped out over the last twelve months has been this intersection of NLP with Transformers, knowledge graphs as a form of information representation and, increasingly, machine learning techniques applied to graphical representations of data, so-called graph machine learning. I think today we've got a pretty great group of people who cover the different aspects of this topic.

[00:06:09.790]On my side, I come more from the investing side. I run a solo venture fund called Air Street Capital, and I invest in AI-first companies. Joining me today, we've got a group of five panelists, so I'll do some quick intros on everybody and you'll get to know them a lot better through their comments on the topics. Everybody has a couple of slides to present throughout the course of this discussion that will help you understand better what they're actually working on.


[00:06:42.070]So first off, we have Amy Hodler, who's based in the US. She works on graph analytics and AI and is a program director at Neo4j (Neo Technology), which makes Neo4j, one of the most popular graph databases out there. She's also co-author of an O'Reilly book on applied graph algorithms in Apache Spark and Neo4j, which was published last year and updated this year. Next we have Isabelle Augenstein, who is based in Denmark. She's an associate professor at the University of Copenhagen.

[00:07:14.770]There she leads a group on natural language understanding, and her particular interests include weakly supervised and low-resource learning, with applications in information extraction, machine reading and fact checking. Joining her, we have Katariina Kari, who's a little bit further north, in Finland. She's a data engineer at Zalando, which is one of the major e-commerce marketplace companies in the world, originally based in Germany. Katariina holds a master's degree in both science and music and has a particular interest and specialization in the semantic web. Based in the UK, we have Fabio Petroni, who's currently studying NLP at Facebook AI Research, based in London, which has a growing presence and interest in NLP.

[00:07:57.850]He was previously in the R&D department at Thomson Reuters, one of the major information companies focused on financial markets. And joining us from Italy, we have Giuseppe Futia, who's a research scientist and knowledge graph engineer at the Nexa Center for Internet and Society. His research interests focus on semantic modeling, data integration, and graph neural networks, which, as mentioned, is one of the hottest topics in machine learning today.


[00:08:28.270]Overall, in this panel, what we're really trying to get at is what machine learning can be applied to in knowledge graphs, and what graphs can do for machine learning. To really unpack that topic, I thought we'd start with some commentary from Giuseppe, who has experience in both. Really, we're trying to understand how the two topics, graphs and knowledge, can be connected. Any thoughts on that?

[00:08:51.070]Thank you very much, Nathan, for the introduction. I am going to share my slides.


[00:09:09.950]Can you see them?

[00:09:11.750]Yes. Perfect.

[00:09:15.530]Well, maybe each member of the audience has a specific background in these concepts related to graphs, symbols and vectors. In this brief introduction, I would like to provide some intuition about the relationship between these three main concepts. So let's start by understanding which type of graph a knowledge graph is. As probably everyone knows, a basic graph structure includes a set of nodes, and these nodes are connected by a set of edges. A knowledge graph has specific features, because in this case the edges are directed.

[00:10:08.010]We can have multiple edges between two nodes.

[00:10:30.550]Is it just me who lost him, or can you also see he's gone?

[00:10:35.170]I think we all lost him.

[00:10:37.150]Okay, cool. So I think we were right at the topic of discussing what a graph is and what a knowledge graph is in particular. Does anybody want to pick up, maybe just closing out the brief introduction connecting the two concepts?


[00:10:57.890]Hi, this is Amy. So just kind of a quick intro. If you think of how things are connected, whether you're talking about concepts and information and data being connected or something like a transportation network, we have a lot of different connections. And as he was saying, you can think of the nouns in your network, the entities, as nodes. Then you can think of the connections between them as either relationships or edges, depending on what terminology you want to use.

[00:11:35.450]With any kind of complex system, when you're talking about lots of connections between those entities, you can mathematically model them as a graph. A graph is simply how you represent a complex network, or any network. That allows you to do really cool things: you can store them natively in their format as a graph, or you can compute over them and do things like graph analytics. So that's just kind of a simple real-world view: you can think of a graph as just a model or representation of a network.
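
To make the nodes-and-edges picture concrete, here is a minimal Python sketch of a directed, labeled graph in which several edges can connect the same two nodes. The class name, method names and sample facts are hypothetical and chosen only for illustration; they are not taken from any product mentioned in the panel.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Directed, labeled multigraph: several edges may link the same two nodes."""

    def __init__(self):
        # subject node -> list of (relation label, object node) pairs
        self._out = defaultdict(list)

    def add(self, subject, relation, obj):
        """Store one directed edge: subject --relation--> obj."""
        self._out[subject].append((relation, obj))

    def relations(self, subject, obj):
        """All relation labels on edges going from subject to obj."""
        return [rel for rel, target in self._out[subject] if target == obj]

    def neighbors(self, subject):
        """Distinct nodes reachable from subject in one step, sorted by name."""
        return sorted({target for _, target in self._out[subject]})


# Hypothetical sample facts.
kg = KnowledgeGraph()
kg.add("Ada", "lives_in", "London")
kg.add("Ada", "works_in", "London")  # a second edge between the same two nodes
kg.add("London", "capital_of", "United Kingdom")

print(kg.relations("Ada", "London"))
print(kg.neighbors("London"))
```

Because edges are directed, asking for relations from "London" to "Ada" returns nothing, even though two edges run the other way.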

[00:12:06.810]One of the aspects that seems quite important here is context. A knowledge graph can help represent that context, which is so important for basically everything we do in machine learning. So it'd be interesting to hear some examples of why context is important, or how somebody might integrate it into a knowledge graph.

[00:12:28.830]Yeah, I love this topic. So I'll do really quick, and then I'll let somebody else speak as well. So to me, context is really the foundation of knowledge itself. So if you think about data in the world, whatever it might be, whether it's biological or economic, it really doesn't exist in isolation, at least not in nature. And that's really what gives our data points meaning. And context is also how we learn. If you think about how a child learns something, they don't learn it in isolation. They learn it in context.

[00:13:02.910]Machine learning and AI in general, without that context, without that fabric of meaning, is less accurate, and we see this over and over again in the real world. It's also more rigid, so it doesn't flex to different situations. There's a classic phrase that always makes me laugh, which people use to explain this: "We saw her duck." If I said that to most of you while ducking at the same time, you would probably think somebody threw something at her.

[00:13:33.270]She ducked. But it could also mean that a group of people saw my pet waterfowl. That's another interpretation. Or maybe somebody named We came over for dinner and sawed my duck, because we're going to have duck for dinner. So there are all these different meanings, and without the context behind it, it's very hard for humans to know, let alone AI systems. So context is how we derive meaning from things and how we learn. And so for AI systems in general and machine learning, context is useful for understanding meaning.

[00:14:12.390]And it's also really important for heuristic decisions. There's a classic chatbot example: you ask a chatbot, "Hey, I want to go camping at Lake Tahoe next weekend with my spouse. What kind of tent should I buy?" There's a lot of context around that question. Lake Tahoe: maybe the weather's really bad this time of year. You said spouse, so you probably only need a two-man tent; you don't need a really huge one. And you said Tahoe, so maybe you want to send them to a new search area.

[00:14:43.230]So there's all this context that we can use to make better decisions. And then the other area with context is it can also just do things like help track data lineage and just understand where your data is coming from.



[00:15:01.650]Yeah. I want to thank Amy for a really good introduction to context, especially in knowledge graphs. I have another example, from the field of fashion at Zalando. We created a whole vocabulary about fashion and things around it, and one of the first applications of the fashion knowledge graph at Zalando was contextual recommendations. So if you're on Zalando buying fashion and other recommended articles or products are shown to you, or maybe something comes up much further up in your fashion feed, then in particular places on the website the recommendations are shown together with context.

[00:16:02.910]The recommendation itself is produced by machine learning, deep learning and different kinds of models. But if the seed for it is that you're looking at a blazer, we give you recommendations of other business wear, and that creates a bit of trust and understanding: Zalando actually knows what I'm doing here. I'm trying to buy a blazer, so I'm looking for business wear. And just giving it that name is a very simple process of checking things up in the knowledge graph and on the product side.

[00:16:41.490]But that little effort of saying, "Oh, you're looking at something business, here are other business clothes," already makes a big difference and gives the customer the context they need.

[00:16:55.050]The example that you bring up in fashion is interesting because it seems like you have a multimodal opportunity: you have images that users can use to describe what they're looking for, and then you have the actual words that they use to describe their search. On the other side, you have Zalando's descriptions of the products, which are presumably written by fashion experts, et cetera. I'm curious if you have any anecdotes or thoughts on integrating different data modalities into one global knowledge graph, or on how much better the experience is if you use images and text as opposed to one or the other.

[00:17:36.390]Yeah. So the way the fashion knowledge graph works at Zalando is that we have what we call fashion tags, which are basically nodes from the fashion knowledge graph. So for each product we know a little bit of context, according to a trend or according to a style like business or a formal occasion, those kinds of things. We also know things like: this is warm clothing, so it's suitable for winter. In the first version of the fashion knowledge graph, all of that was derived from the description of the product.

[00:18:20.290]So it was using a set of logical statements: if this exists there, if it has a warm lining, if it has fur inside, then it's appropriate for winter, and so on. It's just a set of rules. But then we added the visual part, computer vision, and taught it things like vintage dresses from the 60s, or other things where the pure data of the product isn't enough but there are some very strong visual cues that are part of the design and give it a certain kind of context.
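
The rule-based first version described here can be pictured with a small Python sketch. The tags, keywords and function name below are assumptions made up for illustration; the actual rules used at Zalando are not given in the talk.

```python
import re

# Hypothetical rules: a tag applies if any of its keywords occurs in the description.
RULES = {
    "winter": {"fur", "wool", "lined"},
    "business": {"blazer", "suit"},
}


def derive_tags(description):
    """Return the sorted tags whose keywords appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(tag for tag, keywords in RULES.items() if words & keywords)


print(derive_tags("Wool blazer with fur collar"))
print(derive_tags("Cotton T-shirt"))
```

Matching on whole words rather than substrings keeps a description such as "furniture cover" from being tagged as winter wear. Cues that never appear in the text, such as a sixties cut, are exactly what such rules miss and what the visual models were added for.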


[00:19:09.710]So this brings up another topic we discussed, which is how machine learning can be used to actually create and populate these knowledge graphs. Presumably here, machine learning on vision is one of the ways to output labels that your system has already learned, perhaps from a large corpus, and those can be slotted into a knowledge graph. I'd be curious what experience and best practices you've all seen there.

[00:19:38.550]I'll just add on to what you said about the multiple different types of data. You can add metadata and actual text data, but then also things like the visual data that you mentioned. One of the things that we've seen help populate the knowledge graph, or your domain-specific graph, is image similarity. You can represent an image as a vector, using a graph embedding or another type of embedding, and then find similar images.

[00:20:15.390]We see this with a customer in the real estate industry, where you can say, "Hey, I like this picture of this bathroom or this kitchen. Find the other bathrooms or kitchens within 20 miles, in my price range, with this text in the description, and then show me those recommendations." So you can use machine learning to do things like that. I think some of the panel have expertise in NLP, so you can do that with language and develop your ontology that way.

[00:20:44.970]But you can also use it to represent items as vectors and then add those to your knowledge graph as well.
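
As a rough illustration of the image-similarity idea, the Python sketch below ranks catalog items by cosine similarity to a query vector. The three-dimensional vectors and item names are toy values invented for the example; in practice the vectors would come from an embedding model, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, catalog, top_k=2):
    """Names of the top_k catalog items most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalog.items()]
    return [name for _, name in sorted(scored, reverse=True)[:top_k]]


# Hypothetical embeddings for three listing photos.
catalog = {
    "kitchen_a": [0.9, 0.1, 0.0],
    "kitchen_b": [0.8, 0.2, 0.1],
    "bathroom_a": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]  # embedding of the photo the user liked

print(most_similar(query, catalog))
```

Filters such as distance or price range, as in the real estate example, would typically be applied to the candidate set before or after this ranking step.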

[00:20:53.250]That's a way of just better understanding what people are looking for when they're doing searches. On this topic, and also on general-purpose knowledge graphs, I was curious, Isabelle, if you had a sense of how the technology and tools that we have available have progressed over the last couple of years. Are we in a place where these systems are robust for web-scale problems, or is there still a certain area of problems where we just don't have the right tools yet?

[00:21:28.750]I think in terms of the big picture, we're not that much further than we were ten years ago, at least for everything that's general domain. Think about extracting knowledge from Wikipedia to create Wikidata: that's something we could already do five or ten years ago, and we can do it now. What's really tricky is populating domain-specific knowledge bases automatically, and in that regard I think we're no further. In terms of general-domain knowledge bases, of course, machine learning has progressed, so we can populate those automatically with higher accuracy

[00:22:08.050]now than five or ten years ago.


[00:22:13.330]Can you explain why it's hard to populate these, like domain specific knowledge graphs? What's the difference with the general purpose one?

[00:22:21.970]I think the difference is that the knowledge isn't so readily available.


[00:22:30.730]Think about scientific publications. Some of them are behind a paywall. So how do you even get the knowledge automatically?

[00:22:39.070]Of course.

[00:22:39.370]Now there is arXiv and so on, but a lot of papers are still behind a paywall. And that's just one example. Think about, I don't know, medical records, for instance. Those are just private data, so you would never be able to get that.

[00:22:56.750]Out of interest, has there been any effort to crowdsource or share these sorts of domain-specific knowledge graphs, through either consortia that trust one another or through trustless setups that might use things like federated learning or other kinds of privacy techniques?

[00:23:17.390]So at Zalando, we've been advocating this idea: we're building a fashion knowledge graph, a fashion ontology, and we could think about creating maybe a schema.org extension. But there's a real challenge I currently have, which probably applies to anyone who wants to introduce knowledge graphs in their company: getting even different teams to use the knowledge graph internally. We're just getting there. At first, when we were creating something, we were seen as part of the application. I think this is the thing in many companies: they're application driven.

[00:24:02.390]I blame agile methodologies, on one hand, because they drive that kind of product thinking: solving a feature, solving a customer problem. What then sometimes gets less attention is the overall picture of the data architecture and those kinds of things. What we had to do is have one success, one killer feature, so to say, which was the search. Then followed the contextual recommendations, and now we're also giving context on sustainability topics, driven by the knowledge graph. So slowly we're going live with that.

[00:24:43.610]So it's more and more projects. And as that momentum increases, that's when we're hopefully getting the message through: hey, this knowledge graph is part of the ecosystem inside our company. And then hopefully the next thing is to have it also outside of the company. So we really want, for the whole globe, to create a data architecture for fashion.

[00:25:13.750]Let's see.

[00:25:14.290]But it will require the whole industry to start thinking differently, also inside their companies.

[00:25:24.710]Fabio, I'm not sure if you can opine on this, but I'm curious about your perspective, because Thomson Reuters, I would guess, has a lot of this stuff internally, given it's very NLP heavy and mines news, and Facebook has done lots and lots of work on various kinds of networks over the years, even more so recently. I'm curious if you can compare and contrast these kinds of organizational setups as it relates to pushing knowledge graphs into production.


[00:25:51.650]I want to say something first. I think that graphs are a super beautiful structure to represent a lot of things in computer science. Before, we touched on networks: citation networks, transportation, or supply chains, for instance, for finance. But I feel a bit of a black sheep here, because I think the recent trends, at least in research, are showing, at least from my point of view, that a graph is probably not the best way to represent knowledge. There might be something better to represent knowledge than a set of nodes and edges that miss a lot of the subtleties of knowledge.

[00:26:43.910]And there are a lot of limitations with knowledge graphs: they require schema engineering, they force you to query a fixed set of relations, and they always need human supervision in order to be ready. I mean, I have never seen an automatic way of building a knowledge graph that is successful. But actually, as human beings, we already invented the best way of representing knowledge, in my opinion. That is text: we write stuff down. And with the recent advances in NLP, we now have machines that can retrieve pieces of context, reason on top of them and solve knowledge-intensive tasks without using a knowledge base.

[00:27:33.470]But just using text and understanding text. And we are doing a lot of research in this space in London.

[00:27:45.590]Fabio, I have a question for you on this. I'm really curious about how you're thinking about change in meaning over time, because that can be a very difficult thing: a word may have a meaning in the 1950s that changes over time, and today it may mean something different. You want to understand the historical meaning, so you can do whatever kind of search or understanding of the meaning of that text, but then also understand what it might mean today, and then track that drift.

[00:28:24.870]So to speak.

[00:28:25.890]How are you thinking about that? It's kind of an interesting.

[00:28:29.670]I think it's all about context, as we said before. That particular concept, that particular word, is probably observed in a huge corpus in a particular context, with a particular meaning, close to other words and other concepts. So of course there is some structure in text, but I think we should let the machine somehow self-structure that kind of representation. Then if you ask a machine something about a concept and you use keywords about a specific time frame, like 50 years ago, or you add some time information, it should be able to refer to the old meaning of the word.

[00:29:25.690]And just one quick thing: something else you can do to manipulate the knowledge in NLP is just change the corpus. Don't use Wikipedia from ten years ago, use Wikipedia from today, and that will be enough to update the knowledge representation of the machine. Just change the text data.

[00:29:49.430]If I can add something related to what Fabio has just mentioned, I think that we have to think about two different types of semantics. In the case of NLP, you have the so-called distributional semantics, in which the meaning of words is learned by algorithms analyzing large corpora of text. But in the case of a knowledge graph, you establish the semantics at the beginning, from the top. In this way you can encode knowledge that in some cases cannot be derived directly from the text, because in many cases the semantics of an idea is not explicit in the text.


[00:30:45.530]Graph structures such as knowledge graphs can maybe be used to introduce, for instance, aspects such as common sense, in order to improve the performance of machine learning techniques applied to natural language documents, or something like that.

[00:31:05.850]May I chime in here a bit?


[00:31:09.870]I used to work on relation extraction, knowledge base population and so on during my PhD, but it's a topic that's somewhat fallen out of favor. There isn't that much work now on using knowledge graphs for NLP, whereas there is still work on knowledge base population. But that doesn't mean that graphs aren't important in NLP, and I have one example I could show to illustrate that point, if I can manage to share my screen here. So that should work. One example is fact checking, and that's something I'm working on.

[00:31:50.850]In fact checking, to perform an automatic fact check, a model has to understand the underlying evidence somehow, and there is often a lot of evidence for and against. It's a really complex natural language understanding problem. The type of reasoning that the model has to do is to take the claim here, which contains names, different types of entities and different types of things, and then try to connect that claim to different pieces of evidence: connect the entities in the claim to entities in a piece of evidence, then understand how that connects to another piece, and so on, and then arrive at the ruling.

[00:32:34.290]What we found in our research is that representing this as a graph, using a transformer-based neural network that can incorporate the graph natively, performs best for this type of task. So graphs are still important. They're just used in different ways in NLP, in my experience.

[00:33:02.230]Wasn't one of the longest-standing projects the Cyc knowledge base, which began in the 80s as an attempt to encode general-purpose common sense into a graph of text? That project has not received much love, as you describe, and others like it as well.


[00:33:29.690]I just want to say that of course graphs are still important, and I agree with Isabelle. They have a huge role in NLP and AI in general. It's just the way in which we represent knowledge as atomic entities connected through relations, the classical representation of a knowledge graph, that's probably the old way. We should be moving ahead and have more flexible representations of knowledge that are more aligned with the power and capabilities of recent machine learning models.

[00:34:17.730]I find that we have a lot of customers that find graphs a very practical way to get started, especially if it's not a strictly schema-required type of graph. It lets them get up and going fairly quickly with data sets they already have. Companies have tons of information that they've already stored in all these different locations and sources and formats: they've got visual data they have to represent, they've got text data they have to represent, they've got information they have to represent for use in machine learning.

[00:34:54.630]But they need to be able to get something that can touch on the different formats and the different sources. So it can be simply a metadata management system that brings together those different silos and data formats and then lets people build on top of that. I think it was Caterina, maybe, who was talking about being able to get access to information and represent it in a way that applications can use. That is really important. And so I feel like graphs, as opposed to necessarily incorporating everything inside of a graph,

[00:35:34.650]are a really useful structure for basically acting as that metadata layer, or that fabric, to figure out where things are. Then you can use whatever application you might have on top of it, or use it just to query the different sources.

[00:35:52.150]I have a question for everyone on this panel, because I've recently been toying around with this idea. What's happening currently in our fashion knowledge graph is visual learning and computer vision, and we're giving it good examples: these are Sixties vintage dresses, and these are Seventies vintage dresses, don't confuse them. Well, those might not actually overlap, but there are some overlapping things. For example, in shoes with the different heels, kitten heels and pumps and those kinds of things, there are some examples that can also be categorized under the other class.

[00:36:38.530]But then the model makes a mistake because the image vector representations are too close, even though it's a different thing. So you have to draw the line and say, okay, this is a negative example. I've recently been suggesting this to the team, but maybe I need to find some examples. Basically, the way a knowledge graph or an ontology, a formal fashion ontology, is done is that siblings don't overlap. At least that's the way we've designed it: it's comprehensive and exclusive.

[00:37:17.110]Any tree is always comprehensive, or exhaustive, meaning it covers the whole representation, and its branches are exclusive of each other. And I was wondering whether that is a base of positive and negative examples, a data set that they could use. So I wanted to hear if there are any practical examples of using the ontology structure to give data to a machine learning problem or model.
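The idea in this question, that mutually exclusive sibling classes in an ontology could supply negative examples, can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy ontology, the class names and the item ids are all hypothetical, and a real pipeline would pair image embeddings rather than ids.

```python
from itertools import combinations

# Hypothetical toy ontology: parent class -> mutually exclusive child classes.
ONTOLOGY = {
    "dress": ["sixties_dress", "seventies_dress"],
    "heel": ["kitten_heel", "pump"],
}

# Hypothetical labelled items: item id -> ontology class.
ITEMS = {
    "img_001": "sixties_dress",
    "img_002": "seventies_dress",
    "img_003": "kitten_heel",
    "img_004": "sixties_dress",
}


def sibling_pairs(ontology):
    """Return every unordered pair of classes that share a parent."""
    pairs = set()
    for children in ontology.values():
        for a, b in combinations(sorted(children), 2):
            pairs.add((a, b))
    return pairs


def training_pairs(items, ontology):
    """Build (item, item, label) pairs: 1 for same class, 0 for sibling classes."""
    siblings = sibling_pairs(ontology)
    examples = []
    for (id_a, cls_a), (id_b, cls_b) in combinations(sorted(items.items()), 2):
        if cls_a == cls_b:
            examples.append((id_a, id_b, 1))
        elif tuple(sorted((cls_a, cls_b))) in siblings:
            # Siblings are exclusive by design, so this is a hard negative.
            examples.append((id_a, id_b, 0))
        # Pairs from unrelated branches are skipped in this sketch.
    return examples


pairs = training_pairs(ITEMS, ONTOLOGY)
```

Only sibling pairs are emitted as negatives here because those are the confusable cases the question describes; pairs from unrelated branches are usually easy for a model anyway.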

[00:37:57.650]Please go ahead.

[00:37:58.790]What I was saying is that human-curated ontologies are super useful. I mean, don't get me wrong. Those have usually been curated by experts in the field, who convey a lot of their knowledge in that structure. So you can potentially derive a lot of training examples from these knowledge bases. For instance,

[00:38:25.910]in NLP,

[00:38:27.290]if you have Wikidata, you can create a lot of positive claims on top of this knowledge base, like someone was born in London and so on and so forth. And that can be translated directly into training data for the model.
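A minimal sketch of turning knowledge-base triples into claim sentences might look like the following. The triples, the relation name `place_of_birth` and the template are illustrative assumptions, not an actual Wikidata export format. The negative examples, made by swapping in another object seen with the same relation, are an addition beyond what the speaker describes and rely on a closed-world assumption that can be wrong in practice.

```python
# Hypothetical triples in (subject, relation, object) form.
TRIPLES = [
    ("Alan Turing", "place_of_birth", "London"),
    ("Ada Lovelace", "place_of_birth", "London"),
    ("Marie Curie", "place_of_birth", "Warsaw"),
]

# Hypothetical natural-language template per relation.
TEMPLATES = {"place_of_birth": "{subject} was born in {object}."}


def triples_to_claims(triples, templates):
    """Return (claim_text, label) pairs derived from the triples."""
    objects_by_relation = {}
    for _, relation, obj in triples:
        objects_by_relation.setdefault(relation, set()).add(obj)

    examples = []
    for subject, relation, obj in triples:
        template = templates[relation]
        # Positive claim: stated directly by the knowledge base.
        examples.append((template.format(subject=subject, object=obj), True))
        # Assumed-negative claims: same subject, a different known object.
        for other in sorted(objects_by_relation[relation] - {obj}):
            examples.append((template.format(subject=subject, object=other), False))
    return examples


claims = triples_to_claims(TRIPLES, TEMPLATES)
```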

[00:38:47.310]Two other things I can think of: a few years back there was this work on retrofitting word embeddings. The word embeddings would first be learned from text using a standard type of approach, and then they would be adjusted based on a knowledge base. If the knowledge base says that Sixties dresses and Seventies dresses are something different, then the word embeddings will be adjusted accordingly so that they're sufficiently different from one another. Another example I can think of is hyperbolic embeddings. That's not something I've personally worked on or am so comfortable talking about.

[00:39:26.850]But the idea is basically that you try to learn an embedding space for graph structures by encoding them in a hyperbolic space. And maybe that's something Fabio knows a bit more about. I bet you remember something along those lines.
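For readers unfamiliar with hyperbolic embeddings, one common choice of hyperbolic space is the Poincaré ball, and the sketch below computes the distance between two points in it. Picking the Poincaré ball is an assumption made here for illustration; the speakers do not say which model they have in mind. Distances grow quickly towards the edge of the ball, which is what leaves room for the many leaves of a tree.

```python
import math


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit (Poincare) ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    ratio = 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(1.0 + ratio)


# The same Euclidean step of 0.1 is much "longer" near the boundary.
near_centre = poincare_distance((0.0, 0.0), (0.0, 0.1))
near_edge = poincare_distance((0.9, 0.0), (0.9, 0.1))
```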

[00:39:47.790]I mean.

[00:39:49.830]There are a lot of crazy ways to embed a knowledge base. For instance, I seem to remember that they kind of maintain the hierarchy of the structures they have in Poincaré embeddings.

[00:40:11.050]But yeah.

[00:40:13.030]Again, I think the main problem is when you have some knowledge that somehow is not incorporated into your knowledge base. That's the case when you have new information, a new relation, or something trending nowadays. It's impossible, for instance, for someone who manually created the knowledge base to say what's trending tomorrow. So those are the difficult cases. Those are exactly the cases where an ontology, in my opinion, or knowledge graph.


[00:40:55.070]That more unstructured way of approaching the problem may solve.

[00:41:02.510]I have to let Fabio know that that's actually not true. The thing is, we already buy trends years ahead. Unfortunately, it only looks like the fashion industry is emergent: suddenly everybody decides to wear a bum bag this way on themselves, or something like that. Spoiling the surprise, sorry, but it's actually decided years ahead by designers. It's a top-down thing.


[00:41:34.430]And I have to say that using ontologies in fashion makes so much sense, because if you look at data, you always look at past data. I remember this example from a fashion expert. She was explaining to the machine learning team, who were crawling the web and searching for trends, that off-the-shoulder is going to be a big trend next summer. And they were all like, no, it's not. It wasn't there last summer. It's nowhere in the data.

[00:42:04.730]And she said, trust me, everyone's going to want to buy off-the-shoulder. And that turned out to be true. So she knew it before the data knew it. This is a very particular thing about fashion: it is so top-down in its trends that it actually makes sense to put in a bit of semantics. Another thing we're working on is that when our buyers know these trends, they make sure that those trends are already in the fashion ontology. But then there are certain things that are emerging.

[00:42:41.990]You're right.

[00:42:42.710]That could not then be seen and used within the ontology. So, an interesting point.

[00:42:51.650]We see customers that, even before getting into the machine learning aspects, just use the structure itself to make predictions. As you had talked about, if structure is important, you might be looking at something like overlap similarity algorithms to understand supersets or subsets of things: what is a subset of something else? Or we have customers that use a lot of community detection algorithms based on relationships. That's structure, and they'll do that for things like disambiguation. So I have things that are very similar and could be categorized as A or B, this kind of dress or that kind of dress.

[00:43:35.450]But how do I categorize new things based on the relationships they have to, I don't know, fabric or color or image? You could use those things to try to predict where to categorize them. So they use really simple things like that, and then more complex things like understanding paths. We have a lot of patient and customer journey customers that will look at the clicks on a website or, for a patient, the number of visits, the type of diagnosis and what medical test results they get, and look at it over, like, three years, trying to figure out what the journey of this patient is. And then you can embed the journey itself.

[00:44:18.950]And that's very graphy; a journey is kind of graphy. Then you try to learn which journeys are similar to other journeys, and you do the same thing for customers or anything else. So it's kind of exciting to see where people are pushing that envelope.
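The overlap similarity mentioned above is commonly defined as the size of the intersection of two sets divided by the size of the smaller set, so a score of 1.0 means one set is contained in the other. The sketch below assumes each item is described by a hypothetical set of related properties (fabric, color and so on); the property names are made up for illustration.

```python
def overlap_similarity(a, b):
    """Overlap coefficient: |A & B| / min(|A|, |B|); 0.0 if either set is empty."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical property sets attached to two categories and one new item.
evening_dress = {"silk", "maxi", "red", "sleeveless"}
summer_dress = {"cotton", "midi", "floral", "sleeveless"}
new_item = {"silk", "red"}

scores = {
    "evening_dress": overlap_similarity(new_item, evening_dress),
    "summer_dress": overlap_similarity(new_item, summer_dress),
}
best_category = max(scores, key=scores.get)
```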

[00:44:39.990]I think that machine learning applied to the graph structure can also be really useful for recommendation purposes. In my personal experience, we constructed a knowledge graph of the academic publications of my university, and then we exploited graph neural network techniques to analyze this knowledge graph and to identify potentially interesting research opportunities between researchers that have never been co-authors. The analysis and automatic recognition of specific patterns in the links of the knowledge graph are very useful for identifying these unseen paths, which can also be applied for recommendation purposes.


[00:45:34.910]I just want to say that I completely agree with both of you. As I said before, graphs are extremely important. Behavioral data, as you were describing, like clicks or searches and logs, is super important, and the most natural way to represent it is a graph. A citation network, of course, is an explicit graph that is given to you. What I don't like is forcing world knowledge into a structured representation, because you actually lose a lot of information in compressing world knowledge into a graph.

[00:46:13.250]It's much better, though, to work with the lossless representation, that is, text. But that's just for that particular category of graphs and knowledge.

[00:46:27.450]On the topic of detecting differences in trends in text and trying to get models to be robust to that: has anybody worked with these labeling functions, which are sort of ways of expressing heuristics that experts have, using a set of labeling functions to attach weak supervision labels to data, and then using a neural network to approximate the right weights to get some initial training data?

[00:47:01.710]Sure. A few years back, I worked with distant supervision, which is a way of automatically labeling data for relation extraction. Basically, if you see a relationship in a knowledge base that consists of two entities and the relation between them, and then you see those same two entities in a sentence, you assume that the sentence expresses that relation. Based on that, you automatically label sentences as expressing that relation, and then use those automatically labeled sentences to train models to extract relations. That works reasonably well.

[00:47:44.950]And I think people are still doing that to some degree.
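The distant supervision heuristic described above can be sketched in a few lines. The knowledge base entry and the sentences are made up for illustration, and entity matching is done by plain substring search, which is a simplification; a real system would use entity linking.

```python
# Hypothetical knowledge base: (subject, object) -> relation.
KB = {("Barack Obama", "Honolulu"): "born_in"}

SENTENCES = [
    "Barack Obama was born in Honolulu.",
    "Barack Obama visited Honolulu last week.",
    "Honolulu is the capital of Hawaii.",
]


def distant_label(sentences, kb):
    """Label a sentence with a KB relation whenever both entities occur in it."""
    labelled = []
    for sentence in sentences:
        for (subject, obj), relation in kb.items():
            if subject in sentence and obj in sentence:
                labelled.append((sentence, subject, obj, relation))
    return labelled


labelled = distant_label(SENTENCES, KB)
```

Note that the second sentence is also labelled `born_in` although it does not express that relation. That kind of label noise is the known price of the heuristic.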

[00:47:53.210]Actually, I also have experience with that, Nathan, because I developed a tool that exploits graph neural networks to identify the relations between the attributes within a data source, for instance a CSV file, an XML file or a JSON file. I'll try to share the screen again, if it works this time, because I would like to show you an example of these graph neural networks applied to extract relations from a data source.


[00:48:41.750]Can you see it?


[00:48:45.950]This is an example of an integration graph that in some way models all the plausible relations between different attributes. Let me show you: these are the attributes, marked in orange. This integration graph describes the domain of public procurement, in which public administrations purchase goods or services. If we consider, for instance, this business entity, we can see that in relation to the contract semantic type we can have two different types of relationships: a direct relation, which is contracting authority, and a relation in which the path length is equal to two, where the node in the middle is tender.

[00:49:49.470]So in this case we have two different plausible semantic relations between the business entity and the contract, and we can apply different algorithms. For instance, if we choose the graph neural network, in this specific case we see that, according to the background knowledge that has been used to train the graph neural network, the correct one is this one. So the relation between the contract and the business entity is contracting authority, while the other relation is not correct. This is a simple graphical way to see how different algorithms can be exploited to identify the correct relations among all the plausible relations. And exploiting this type of visualization,

[00:50:43.890]You can also.

[00:50:57.370]Did he kick himself out again?

[00:51:00.910]Yeah, I think so.

[00:51:01.990]Because share screen is the eject button. Anybody else want to try and share their screen to get ejected? Does anybody else have any interesting applications of graph neural networks, actually, on the topic?

[00:51:23.950]I'll just say, and I don't have anything to share because it says I have to restart my browser, and I know how that would go: we've started using graph embeddings, in particular GraphSAGE, which has a graph convolutional neural network underneath it. What I find really interesting about that is you learn a function of your node and the surrounding neighborhood, so it looks at the neighborhood, learns a function for it, and then aggregates it. And that's as opposed to something like node2vec or FastRP, which is really cool because it's so fast, where you have to retrain your model every time you add in new data.

[00:52:11.570]If you learn a function, you can either train on a small subgraph and then apply it to the bigger graph, so training the model is faster, or you can learn a function and then, as new data comes in, just apply the function. I think those are some exciting things as far as using graph neural networks on a graph structure itself.
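The point about learning a function rather than a fixed vector per node can be illustrated with a single mean-aggregation layer in the style of GraphSAGE. This is a simplified sketch, not any library's implementation: the weights are hand-picked instead of learned, there is no neighbor sampling or normalization, and the toy graph is hypothetical. Applying separate weights to the node and to the aggregated neighborhood and summing is equivalent to applying one weight matrix to their concatenation.

```python
# Hand-picked weights for illustration only; in GraphSAGE these are learned.
W_SELF = [[1.0, 0.0], [0.0, 1.0]]
W_NEIGH = [[1.0, 0.0], [0.0, 1.0]]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_vector(vectors, dim):
    """Element-wise mean of the vectors, or zeros when there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sage_layer(features, neighbours, w_self=W_SELF, w_neigh=W_NEIGH):
    """One layer: ReLU(W_self * x + W_neigh * mean(neighbour features))."""
    embeddings = {}
    for node, x in features.items():
        neigh_feats = [features[n] for n in neighbours.get(node, [])]
        aggregated = mean_vector(neigh_feats, len(x))
        combined = [s + a for s, a in zip(mat_vec(w_self, x), mat_vec(w_neigh, aggregated))]
        embeddings[node] = [max(0.0, value) for value in combined]
    return embeddings


features = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
neighbours = {"a": ["b"], "b": ["a"]}
before = sage_layer(features, neighbours)

# A new node arrives later: the same function is applied, nothing is retrained.
features["c"] = [2.0, 2.0]
neighbours["c"] = ["a"]
after = sage_layer(features, neighbours)
```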

[00:52:45.990]A comment regarding the first topic, labeling and distant supervision. That is basically a way of dealing with cases where you don't have a lot of data, which is pretty common in business and in real life. I think there is a really cool trend right now that seems to work really well: few-shot learning, where you have just a few examples for your label or class, plus a description of the task. Some models are actually able to read the description of the task, see some examples and solve the task, a bit like humans do.

[00:53:41.470]We don't need thousands and thousands of examples; we just need a task description and a few examples shown to us. There is some really cool work in this space by the OpenAI folks with GPT, but also in Europe: there is a lab in Munich that is working on really cool algorithms to be able to do so.

[00:54:10.490]Yeah, there are potentially two more things I can add there. One thing I'm working on is stance detection, which is about understanding the attitude of something towards something else. It turns out that people have very consistent attitudes towards topics. So once a machine learning model has worked out what the attitude of a user towards a topic is, it's possible to weakly, or automatically, label all tweets about that topic as expressing that attitude. You can probably imagine that, thinking about people tweeting about politics and so on.

[00:54:51.050]The other area where weak supervision works well is cross-lingual learning, which isn't something we've talked about yet. Languages that are very similar to one another can be useful for cross-lingual transfer learning. Thinking about an example from the Nordic region, training a model on Danish and then applying it to Swedish works really well when it comes to few-shot learning. And one thing we worked on when it comes to knowledge graphs, let me share my screen again, is that we've worked with the World Atlas of Language Structures, which is a knowledge graph that includes grammatical features of languages.

[00:55:36.990]So it includes things like the order of subject, object and verb in that language, or how many genders it has or so on. And you can also see here that there is some geographical relationship between these features and how similar they are across languages. And then what we've shown in one of our papers is that it's possible to automatically populate knowledge bases using matrix factorization, taking the language representations as a feature. I'm not sure I want to go over the architecture here, but anyway, you might get the general idea.

[00:56:23.890]Another thing that we've worked on is automatically encoding these types of typological features, as they're called, in networks for cross-lingual sharing. We have found that encoding those helps cross-lingual sharing: if the network knows what these types of structural features are, that helps with cross-lingual sharing. And if we specifically try to prevent a model from learning or using typological knowledge, that actually hampers cross-lingual sharing, and the cross-lingual transfer performance goes down. So I found that quite interesting.

[00:57:15.470]And at the same time, it seems like what has really enabled these low-resource language translations has just been massive transformer models, right?


[00:57:26.330]So massive transformer models have definitely been good news for cross-lingual sharing. But even when you apply these massive transformer models, it's still possible to get gains from using these knowledge graphs. In the experiments we've done, we've actually used BERT,

[00:57:42.530]for example,

[00:57:44.450]and so on. If we fine-tune it to incorporate knowledge from this WALS knowledge graph, performance increases. Even BERT doesn't know everything.


[00:58:04.530]This maybe relates to what you, Fabio, were saying earlier. Do you think there are contexts or reasons why it makes sense to have specialized knowledge graphs, and then large transformers on top of them, to bootstrap some pretty basic things that we can already encode in the system? Or are you more of a purist, and you believe that everything should be learned tabula rasa from text?


[00:58:40.730]Instill knowledge into a transformer? Let me tell you something. I think the best models for knowledge-intensive tasks, and maybe I can share something in a moment, are based on a transformer, but with something more. It's not just about the transformer: you need a component that brings knowledge into the transformer. There is some knowledge that remains stored in the parameters of the transformer once you train it, but that's definitely not enough for knowledge-intensive tasks. This component can be a knowledge graph, or it can be a retrieval system on top of a knowledge source.

[00:59:29.650]So for domains where you don't have a knowledge graph, or where in general you need to condition on specific knowledge, what you can do is just replace the knowledge source: collect a lot of documents for that domain, create an index on top of these documents, and let the model retrieve the relevant information it needs as context to answer a particular input. Let me share something.

[01:00:00.310]This is quite effective. I know. In like, dialogue systems, for instance.

[01:00:05.230]Definitely. Okay.

[01:00:19.350]Can you see something?

[01:00:21.570]Yes. Okay.

[01:00:23.910]One of the latest projects I've been involved in is the creation of a benchmark, KILT, a benchmark for knowledge-intensive language tasks. We basically collected a set of tasks that cannot be solved just by looking at the local context, but need to condition the answer on a huge amount of text, in this particular case the whole English Wikipedia. These tasks span from slot filling, which is basically knowledge base population, and here you can see an example, to open-domain question answering, dialogue, fact checking and entity linking.

[01:01:06.330]The nice part of this benchmark is that all these tasks are aligned to the same version of Wikipedia. What we want to do with this benchmark is compare all these different ways of representing knowledge, as text, as a knowledge graph or as vectors, and score which models are better. Again, at the moment, the best models are based on a retriever and a reader, and there is no knowledge graph in them so far, but the competition is live, so everyone is welcome to participate.

[01:01:51.990]Keepbenchmark. Org.

[01:01:55.770]What's currently the best in class on this benchmark?

[01:02:00.270]The best in class is actually RAG, which stands for retrieval-augmented generation. It's basically a transformer that has a retrieval component, but that retrieval component is differentiable. So during the training process the model learns how to retrieve in order to maximize the overall performance. The retriever is also updated during the training step, so the gradients are propagated through both the retriever and the reader jointly.

[01:02:39.870]May I ask, when you say it's end-to-end differentiable, does it actually do the whole retrieval in an end-to-end differentiable way? Because most of the things I've seen just do re-ranking in an end-to-end differentiable way, but not the full retrieval process.

[01:02:56.610]They do ranking, so you're right, in the sense that what we do is basically chunk Wikipedia into pieces of 100 tokens, and then we rank those according to the likelihood of being relevant to your query.
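The chunk-then-rank step described here can be sketched as below. Two things are assumptions made for illustration: tokens are approximated by whitespace-separated words, and relevance is scored by simple word overlap as a stand-in for the learned retriever the speaker is talking about.

```python
def chunk_tokens(text, size=100):
    """Split text into consecutive chunks of at most `size` whitespace tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def rank_chunks(query, chunks):
    """Order chunks by number of distinct words shared with the query (descending)."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(chunk.lower().split())), index)
        for index, chunk in enumerate(chunks)
    ]
    # Higher score first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [chunks[index] for _, index in scored]


# Hypothetical mini corpus standing in for Wikipedia passages.
passages = [
    "graphs are useful",
    "transformers read text",
    "retrieval finds text passages",
]
ranked = rank_chunks("retrieval of text", passages)
```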


[01:03:18.150]And then there is the reader, which sits on top of what is retrieved in order to provide an answer. But yeah, something where we lack a bit of performance with this approach is multi-hop, which is what you were discussing before, Isabelle. I think we need something more,


[01:03:47.050]some additional component in this architecture to do multi-hop reasoning.

[01:04:01.790]Is there a best practice around getting quality labels for these kinds of tasks, and corrections on what the model gives you?

[01:04:15.930]What we did is basically collect a set of data sets that have been created by the research community, and those were mainly collected through Mechanical Turk.


[01:04:30.970]And there are some practices, like collecting a lot of examples in a single session, to be sure that people are not cheating.

[01:04:49.490]But do you think we've reached the point where we need more targeted or more nuanced ways of getting labels that don't just rely on the segmentation that MTurk gives you?

[01:05:07.650]Actually, there are some tasks that are highly segmented, where the output is an entity, and that entity is a Wikipedia page. Some other tasks are more open-ended, so the output is an explanation. There is a task called ELI5, Explain Like I'm Five, a data set proposed by others at Facebook last year, where they basically, not crowdsourced, they crawled Reddit threads where the explanation is free-form and pretty long. So that is definitely not just factual knowledge and not just, let's say, a category.

[01:05:55.470]It's a free form text for an explanation for an input.

[01:06:07.930]Sorry, no, go ahead.

[01:06:09.850]That's actually a good point. Speaking of explanations, as I said, I work on fact checking, and in that context also on explanations for fact checking. What we did is use the annotations journalists have made, and also their explanations. Fact checking is a really difficult task for people who lack the required background in politics or public health and so on. For that type of thing, I wouldn't rely on MTurk, but it's what most people in NLP do when they create data sets: they just collect some data and process it.

[01:06:57.230]Fact checking, I think, also brings up an interesting issue, which is around answers that are not necessarily binary, where there might be opinions or perhaps some uncertainty or some kind of distribution over, like reality, I guess, or like accepted reality. And I'm curious whether any of the approaches you've seen work particularly well for integrating that kind of uncertainty or if there's a major roadblocks to doing so.

[01:07:26.030]Yes, most of the approaches I've seen they just assume that there is some kind of evidence which is factual and not opinionated.


[01:07:38.750]Most research just ignores that and assumes the evidence is reliable. Actually, I've not seen any research that assumes that the claims themselves are unreliable. There is work on identifying check-worthy claims, including our work. But research on, say, fact checking and veracity prediction itself just assumes that the claims in the data set are claims and not opinions. There's still a lot of work that could be done in that area.

[01:08:12.750]Just a quick follow-up question on that. Have you seen the use of just basic similarity, whether you're talking cosine similarity or Jaccard or something like that, to give you a percentage? I'm just thinking that might give you a probability, like, we're not 100% sure, so maybe it's 80% likely to be true. Is that being looked at?

[01:08:38.250]So basic cosine similarity works well for identifying evidence or doing some really basic fact checking, as you said. So just taking the cosine similarity between some kind of claim vector and evidence vector works quite well and then also identifying previously fact checked claims. Something like that works quite well.
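For reference, the two similarity measures named in this exchange can be written as follows. The vectors and word sets are placeholders; how the claim and evidence vectors are produced is not specified in the discussion. A similarity score of 0.8 measures closeness between representations and is not by itself a calibrated 80% probability that a claim is true.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(a, b):
    """|A & B| / |A | B|; defined as 0.0 here when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Placeholder claim and evidence representations.
claim_vector = [1.0, 2.0, 0.0]
evidence_vector = [2.0, 4.0, 0.0]
claim_words = {"off", "the", "shoulder", "trend"}
evidence_words = {"the", "shoulder", "trend", "summer"}

vector_score = cosine_similarity(claim_vector, evidence_vector)
word_score = jaccard_similarity(claim_words, evidence_words)
```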


[01:09:05.890]I just want to add that there is a third option. It's not always binary, and the third option is not enough information: simply, the model was unable to assess the claim. And I completely agree with Isabelle that MTurk is absolutely not good for checking real-world claims. But the kind of claims we had in our benchmark are typically generated from Wikipedia, and those are much easier than what you see in real life. For real life, you definitely need expert fact checkers to be able to assess a claim.

[01:09:51.970]I read somewhere, I don't remember where, a paper, I think from one company, I may be wrong, saying that they were using a five-point scale but then decided to go for a three-point one: yes, no or not enough information, just because it's clearer. Because you may rate something as mostly true, and people then claim that what you have assessed is absolutely true.


[01:10:24.230]I think most work actually does that. Humans are really not very good at telling whether a claim is mostly true or fully true in these types of finer-grained systems. It's actually better to just go for maybe a three-way system, or there can be another label, which could be: I don't know, the claim itself is just opinionated, or not worth checking, and so on, depending on the data set. But in terms of just veracity, I think a three-way system.

[01:11:02.470]Isabelle, you mentioned that in research there's no particular effort; well, there's just the assumption that the claims you're working on are true, and people don't really spend time on other kinds of assumptions. I'm just curious if there's been any evolution in the academic setting in how benchmarking tasks are developed, and the extent to which they're increasingly mirroring real-life contexts, or if they're still in pretty artificial settings, which are crafted because the real world is messy. And for some reason people don't want to work on that stuff, because maybe it's less popular.

[01:11:40.510]I don't know.

[01:11:41.950]I think there are two schools of thought. There are some people who create these artificial benchmarks just for developing better neural networks for the task, and the example that Fabio showed us is one of these. And then there's the school of thought that really wants to develop real-world data sets and really solve the real-world task. I think most research in that area is more focused on applying previously researched methods to those real-world problems, and those efforts are really valuable. But I've not seen that much work on solving a real-world problem and, at the same time, coming up with a new neural network architecture. The new models are mainly developed in these artificial settings, is what I've seen.


[01:12:38.090]Do either of you have an opinion on what the roles are of academia versus big companies in the context of some of the work that we're discussing, which from the outside increasingly looks very compute-intensive? Do you think there are some types of problems or research endeavors that are better placed in big companies or industrial groups, and other ones that are better placed in academic settings?

[01:13:11.190]I can go first.


[01:13:15.910]You're right.

[01:13:17.050]Nowadays you need a lot of computational resources to train some models, and historically academia lacks this kind of resources. But at the same time, for instance, what we do at FAIR is open-source all the code we write and every model we train. This has been super useful for labs without a lot of computational resources, just to use some models, or fine-tune some models, to do research. I think the key is to create more synergies and more flow of people between academia and industry. We have researchers that are 50% in academia, as professors, and 50% in industry.

[01:14:15.790]We have joint PhD students with academia. We have a lot of interest from universities, and we also contribute to universities: we teach and give seminars. So, yeah, these two worlds need to be really close together.

[01:14:43.970]Go ahead.

[01:14:44.210]Giuseppe, thank you.

[01:14:46.550]If I can add something about that, I believe that the role of academia should be to identify new research directions, because obviously large companies have specific purposes that are related to their products. So maybe academia can play the role of exploring new ideas and new directions of research, and then, at a later time, the results from these new directions can be integrated into companies. Maybe this can be a good distinction between the types of research to be conducted by academia and by companies.

[01:15:35.310]So I still agree with both of you. As far as getting academia and business having more cross fertilization, I would say that the commercial customers we work with that do some of the most impressive things and have the best proof of concepts and production success are the ones that you also see at research conferences. If you have a data science team and you're in the commercial space and you want to kind of up level their game, I would definitely say make sure that you're funding and sending them to some of the academic conferences, so you can see the different directions that things are headed.

[01:16:17.850]They can have a network of people to bounce ideas off of, because in isolation, it can be really difficult for a small data science team to really figure out what direction.


[01:16:31.450]As much cross fertilization as you can do, and then both ways, though, the commercial space can also help the academia community see how messy real world data is, how terribly formatted, how much duplication there is, and just the challenges of even getting global teams to agree to certain formats and things like that. So I think there's a lot that they can teach each other. And I'm glad you guys brought that up.

[01:17:04.690]Yeah. If I could chime in here from maybe representing the business side is that.

[01:17:18.110]In a way.

[01:17:18.590]It is not as useful, from the business side, to go to academic conferences, because they are talking at a level that is beyond solving a problem: a really deep level of one problem, in very fine detail. Many of the talks are just too detailed, even down to the mathematics of a mechanism or a structure, whereas the overall problem that is being solved is less discussed, I would say, in these kinds of conferences. What I would love, and what I know has worked in the past,

[01:18:03.210]Also, when I was previously working in the digitalisation of classical music, was to have these, like, hackathon type of events. There's one called Music Texas, for example, which is awesome. You should check it out, but it's like these places where you had PhD students just forgetting about their research and coming to hack for a weekend and then having industries and businesses be there and represent their problems be like, this is my problem. These are our problems. So if the business can be really like this is not for us to be really vocal about the problems and then have people from the like me, these kind of events were kind of find the quick solutions or try something new out.

[01:18:59.070]So I have this research, and I have that, and I have that technology. Oh, they have that problem. Okay, let's try this and this and this. That kind of laboratory work, I think, is probably the best way to get hands-on. You have these people who have a lot of knowledge of different applications, and it only really becomes concrete for the businesses when you start solving the problem. That's the only way I've learned to even discuss anything with anyone. It's always been in the problem space, and not so much in the solution and technology space.

[01:19:33.210]And I think conferences are very much in the solution space. Right.

[01:19:39.090]I agree there. And to give one more example from the area of fact checking: again, there are some fact-checking conferences which are really about approaching the problem from different areas, not just automatic fact checking, which I work on, but also from a journalist's point of view, or a policy point of view, and so on. And one of the things I've learned attending these types of events is that it's really important to think about why one would want to do automatic fact checking. Just displaying warnings to people saying, hey, this is fake news, doesn't work unless it's done well.

[01:20:17.670]For instance, I think it's really worth going to these types of events that focus on problems and not just on solutions.

[01:20:27.330]Have you seen good examples of companies that have contributed high-quality data sets that can serve as improved benchmarks for academia on certain tasks? An example would be self-driving, where for a long time computer vision teams at universities were using KITTI, a data set where, if you look at the pictures, your iPhone can do far better than the pixelated, sort of '90s-style images that people were using. And now AV companies are seeing a lot of the data sets. I'm curious, on the topic of NLP or graph,

[01:21:04.230]if there have been any other good examples of this kind of transfer.

[01:21:13.330]At Facebook Research, we open a lot of data and data sets that are extensively used in research. I'm thinking about deepfakes, or Hateful Memes recently, or, I don't know, miscon in the biomedical domain, but they are endless, and also in NLP. On the dialogue side, there is a lot of data created by our lab.

[01:21:50.950]Do you think that's fruitful, though? Is it actually yielding the results that you're hoping for?


[01:22:05.750]I mean, I think creating data is expensive; it costs a lot of money for the annotation. It's good that there are companies that allocate budget and create some data that is not just proprietary data, but is then shared

[01:22:27.530]with the research community.

[01:22:31.970]I mean, there are some labs that don't open-source. I think that's a bad model. I believe in openness, so we should be open, at least in the research field.


[01:22:47.850]The topic of openness is something that's been quite hotly debated in my Twitterverse, especially ever since Papers with Code, which is part of Facebook Research nowadays, did some really neat integrations with arXiv, at least flagging the issue of openness in arXiv submissions. I'm curious what level of openness you would generally advocate for. There's the idea that's represented in the math and in the words of the paper, then you have the code to run the experiments, then you have the dependencies and the libraries that sit below, and then maybe some other dark secrets that may or may not be communicated.

[01:23:34.770]And I'd be curious to know what level you think is generally possible for the industry, and what level is desirable.


[01:23:44.230]In research, I think we should be 100% open. I mean, the results should be 100% reproducible. Everything should be, so you should be able to redo even what other people did. And this is what Papers with Code and a lot of other efforts are trying to do in the research community. Of course, industry is another world. A lot of companies build their success on top of proprietary data. And for instance, it's my company where I have been basically ISL data. I call it the data for finance.

[01:24:23.710]And of course, in that case, also, I did a lot of stuff in order seeing a lot of NLP data. There is still a data set for topic classification from Reuters that is widely used. So I think they should find a balance: maybe you open-source some stuff, a particular subset. It's also super helpful for the company, because if the research community starts working on your problem, you can get better solutions for free. But it's not really a common practice.


[01:25:05.270]One thing to add here is that many machine learning conferences nowadays actually have a requirement that the results should be reproducible. So for example ACL, NeurIPS and so on: there is a checklist that researchers have to fill in when they submit a paper. So it's increasingly being enforced by the research community when a paper is submitted; otherwise it's not published.

[01:25:37.890]I think that the frameworks released by large companies for neural networks also play an important role, because if we think about PyTorch, for instance, it's now used a lot for research. So I think it's actually a great contribution from companies towards academia and researchers.

[01:26:12.490]And then maybe to move towards the last part of the discussion here: what are you most excited about, particular projects or research teams, in the next twelve months or two to three years?

[01:26:36.770]So I guess I'll go first, and I mentioned this before. I'm most excited about some of the things that people are doing with graph embeddings, because it allows them to use graph-based models in a lot of different scenarios, for very abstract types of learning. There was some really nice work done by Google DeepMind and, I think, the University of Edinburgh and some others a couple of years ago on relational inductive biases, basically using the relationships in your network to create bias and create

[01:27:17.690]what they call graph networks, which are a little different from a graph neural network, but basically use a graph structure and embeddings to do your machine learning. They did a really nice job of laying out a structure that you can use to create graph-native learning: learning inside a graph, where you start with a graph, you do learning, and then you end with a graph. And what's really cool about what they've shown is that you can actually track transient states. If you can imagine doing that, then your domain experts have a much easier time tracking back how something was actually learned, which is kind of a big deal if we think about people, and society, being comfortable with deep learning and other kinds of AI systems.

[01:28:12.050]So I think for me that's really super exciting. And they also showed that you can do more learning, or quicker learning, with less data, and of course we all know small data is a huge issue. You can also potentially do multiple different types of feature engineering and learning at the same time on a graph. So I'm most excited about that work right now, and I think we're probably a couple of years from seeing that commercially available, but we are seeing people starting to edge into that space.
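
As a rough illustration of the "start with a graph, do learning, end with a graph" idea described above, here is a minimal Python sketch. It is not the DeepMind implementation: the update rules are hand-written sums rather than learned functions, and the names `graph_step` and `run` are hypothetical. It only shows the shape of a graph-to-graph update (edges first, then nodes) and how keeping every intermediate graph makes the transient states inspectable.

```python
from typing import Dict, List, Tuple

Nodes = Dict[str, float]
Edges = List[Tuple[str, str, float]]  # (sender, receiver, edge value)


def graph_step(nodes: Nodes, edges: Edges) -> Tuple[Nodes, Edges]:
    """One graph-to-graph update: edges first, then nodes (toy update rules)."""
    # Edge update: each edge combines its own value with its two endpoint values.
    new_edges = [(s, r, w + nodes[s] + nodes[r]) for s, r, w in edges]
    # Aggregate the updated edge values arriving at each node.
    incoming = {name: 0.0 for name in nodes}
    for _, receiver, value in new_edges:
        incoming[receiver] += value
    # Node update: each node combines its own value with its aggregated messages.
    new_nodes = {name: value + incoming[name] for name, value in nodes.items()}
    return new_nodes, new_edges


def run(nodes: Nodes, edges: Edges, steps: int) -> List[Tuple[Nodes, Edges]]:
    """Apply graph_step repeatedly, keeping every intermediate graph."""
    history = [(nodes, edges)]
    for _ in range(steps):
        nodes, edges = graph_step(nodes, edges)
        history.append((nodes, edges))
    return history


history = run({"a": 1.0, "b": 2.0}, [("a", "b", 0.5)], steps=2)
for step, (step_nodes, step_edges) in enumerate(history):
    print(step, step_nodes, step_edges)
```

Because the input and the output of every step are both graphs, a domain expert can look at any entry in `history` and see the state of the graph at that point.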

[01:28:42.350]The first thing I'm excited about currently is bringing process mining into a graph, or combining it with graph information. It's a very recent idea, but we're fairly advanced in doing process mining to understand the ways in which we handle customer emails or customer queries, or send reminders about invoices and those kinds of things: how they actually move internally through the company's databases and systems, who's processing them and when, and how to improve that based on time and those kinds of things.

[01:29:43.150]But I think.

[01:29:45.730]The hypothesis I would love to work on next is to see how those processes are actually like signals inside the data network of the company. As a process is mined, we also start understanding the data architecture bottom-up, and it shows that we have data structures that are less used and data avenues that are used a lot, and then how to improve our data systems in the company from those findings. And in a way it's so natural, because that's what graphs are about.

[01:30:24.250]It's the natural way, the network way, of how data flows, and the aim is to just paint that for the enterprise, because usually you have these really rigid enterprise structures being put on top. But actually, the way knowledge and data flow works against that. And it would be really nice to make that apparent with process mining and using graphs, maybe graph visualizations as well. Yeah.

[01:30:54.950]Process mining has been a very hot topic in the enterprise with regard to automation, what people call robotic process automation, as a way of discovering what your repeatable processes are.

[01:31:08.390]And then it shines light on the data architecture that's actually being used.

[01:31:15.290]The visualizations I've seen look like spaghetti hairballs. So I'm sure you can do a lot better than that these days.

[01:31:22.370]Katrina, is there research you'd recommend on that, that you're most excited about?

[01:31:28.070]So currently, what's happening is that in my department we have a separate team working on process mining, so they know about the research itself, and then I look at it from a data architecture point of view. So no, I haven't, because I just had a discussion with them and came up with this idea. Actually, I'd love to know if anyone else knows of any research.

[01:32:04.050]I just know companies that sell this as a service.


[01:32:21.970]I think that probably one of the topics for the future is related to explainable AI.

[01:32:31.150]Because actually.

[01:32:35.170]We work in contexts in which the impact of deep learning techniques will be really relevant, for instance recruitment tools or medical diagnosis. In these cases, explainability is not only a desired property, but in many cases will also be a legal requirement. So all the research related to this type of topic will be really interesting, I think, in the future.

[01:33:05.110]Yeah, I completely agree. That will also be my research focus going forward. I just won a big research fellowship on this one, so I'm planning to work on explainability for the foreseeable future. It's really a problem that's largely unsolved. I mean, even just how to evaluate explanations, what makes a good explanation, is something there's no consensus on yet.

[01:33:34.610]Isabelle, have you guys looked at Cynthia Rudin's work on explainability? It's really fascinating, and it's about not always using machine learning techniques if they're not necessary, especially deep learning, because they're harder to explain. I'll see if I can get a link to her work. There are some really fascinating alternatives, I think, if you haven't seen her work.


[01:34:08.730]There are actually a lot of exciting directions in this space. Explainability, I do agree,


[01:34:18.210]needs a push in research. Something else I'm thinking about is language generation. We now have models that can generate incredible text, and if you think a bit about it, what they do is a bit of retrieval: they retrieve stuff from things that they have read. Something we are starting to explore is whether we can exploit this generation to improve retrieval, maybe conditioning the generation on a set of things we care about. And we recently had some success in retrieving entities in this generative fashion.

[01:35:02.250]Another direction that we are currently working on in London is how to combine reinforcement learning and knowledge, how to bring knowledge into a reinforcement learning system. For example, when you have an agent playing a game, how can you condition the actions of the agent not just on experience, like trying and failing a lot of times, but on a knowledge source that tells you a bit about the game: the rules, which kinds of monsters you are going to fight against, and this kind of stuff. So how can we combine, in general, knowledge and experience?

[01:35:43.710]And this would be the key to bringing AI forward, acquiring common sense in the process.

[01:35:55.390]We have a question from the audience, which is about elaborating on the use of knowledge graphs for the purpose of explainability. Are there some good examples of how that works or how it might work?

[01:36:17.630]I don't know about knowledge graphs, but what we did in KILT, the benchmark I presented before, is insert an explainability score. Basically, you should not give me just the answer, but also a passage from Wikipedia that supports it, that gives some evidence for the answer. That, of course, is a step in that direction. I can think that, in a knowledge graph, you can potentially provide the path, like a reasoning path for your model.

[01:36:57.370]Or whatever.

[01:36:58.030]I mean, any image, something that shows a human that you got there for the right reason.
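
To make the "reasoning path" idea a little more concrete, here is a minimal Python sketch under simple assumptions: the knowledge graph is a plain list of (subject, relation, object) triples, the toy triples are invented for illustration, and `explain_path` is a hypothetical helper rather than part of KILT or any library. A breadth-first search returns the shortest chain of triples linking a question entity to an answer entity, which could be shown to a human as supporting evidence.

```python
from collections import deque
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def explain_path(triples: List[Triple], start: str, goal: str) -> Optional[List[Triple]]:
    """Return the shortest chain of triples from start to goal, or None if none exists."""
    # Index outgoing edges by subject for quick lookup.
    outgoing = {}
    for subject, relation, obj in triples:
        outgoing.setdefault(subject, []).append((relation, obj))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in outgoing.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None


toy_graph = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Marie Curie", "field", "Physics"),
]
path = explain_path(toy_graph, "Marie Curie", "Poland")
for subject, relation, obj in path:
    print(f"{subject} --{relation}--> {obj}")
```

The search follows edges only from subject to object; whether to also traverse inverse relations is a design choice that depends on the graph.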

[01:37:08.150]Actually, on the use of knowledge graphs for explaining, I think there is a lot of room for research, because if you try to map, for instance, the input features onto a knowledge graph, and also the output results onto a knowledge graph, maybe we are able to map to a symbolic system that can be useful for understanding and interpreting the results of a machine learning algorithm. Moreover, knowledge graphs are naturally built to be queryable. So in this case, you could also perform queries on knowledge graphs, and you can also obtain results using different artifacts, for instance images and also text.

[01:37:58.530]The other interesting thing is that knowledge graphs are built by domain experts. So maybe the results that come from a symbolic system can be much more understandable for experts in a specific domain, even if these people are not experts in the technology behind neural networks or deep learning architectures.

[01:38:26.530]One of the things I find interesting on the topic of explainability is that when we talk to businesses, they don't define explainability in the same way that technologists do, as in: explain how this decision was made, and by what path was it made? Which is how I think most data scientists and engineers are thinking about it. A lot of the businesses we talk to, when they say explainability, they mean: can you explain which data was used? Can you explain where and how that data was collected, and how having it collected by a man in a certain geography, versus a woman, may influence what's recorded?

[01:39:09.070]Is the data itself biased? Are we double counting data because it's actually represented in two data sets? And so this whole governance of the data and the model is how most of the business people we talk to think about explainability. So that is something I do when I talk to groups: just trying to figure out what type of explainability we are looking at. And from a business standpoint, graphs and knowledge graphs are really great, because the information and context is already there.

[01:39:41.530]Kind of like we've already talked about from a technology standpoint. I think both Isabella and have brought up a lot of points about things that you can do to help explain. But I feel like there's so much more we can do, and it's on the horizon for explaining how decisions are made. But hopefully in the next few years we'll get a little bit further. I don't know how you guys are seeing it, but it seems like we're just scratching the surface there.


[01:40:11.530]And sometimes it can be something really simple, like that earlier example: because you're looking at a blazer, here are other business clothes that you might like. That alone might already be enough of an explanation from the customer's point of view: I'm looking at this, it's a business thing, so I'm getting other business suggestions. So I think you're touching on a really important topic, which is who your audience for the explainability is. You can have legislators, like you were talking about, or human interest groups, human law groups or NGOs that are advocating for people being retrieved.


[01:41:11.490]Auditors who want to check that you're correctly using the data and those kinds of things. Or you can have just customers who are like, why am I seeing that? And they just want their customer experience of browsing fashion to be explained to them in very little detail, without much depth at all.

[01:41:38.230]Yeah, I'm curious if you have thoughts on which domain applications explainability is really crucial for, as opposed to where it's hyped, or seems to be important but isn't.

[01:41:58.730]I think in areas where it's really used for decision making, like medicine or law, explanations will be important, but I'm not sure how often machine learning models are actually used in these applications. So on one hand, from a theoretical point of view, obviously it's important that whenever they're used for decision making, there is an explanation. On the other hand, I think it's a bit overhyped, because not many people would use such models in practice for such important things.

[01:42:31.730]I don't know if I completely agree with that, Isabelle, especially in finance and insurance, where you get machine learning models that are used to make predictions on risk. And so that affects people's insurance rates. And then you also see them used in, like, what credit is offered to certain people. And I think there was a high-profile incident not too long ago with a fairly well-known and well-to-do couple, where the woman had a much poorer credit card offer than the husband did, just based on what the machine learning model was predicting.

[01:43:18.210]I think there are a lot of things that happen that seem rather small but kind of nudge in certain directions. And that's why I think explainability is important in a lot of areas, not in every area. If you're doing a movie recommendation, it's probably not as important as in medicine, but we do see it, especially in finance, insurance and actually even policing a little bit, which is a little scary. And there was also the issue of using people's LinkedIn profiles and resumes for predicting who would be a good candidate at a particular tech company.

[01:43:56.250]And it turns out that women talk about and use different verbs in their resumes than men do. And the machine learning was picking up on that. So they stopped using it.


[01:44:07.890]These little things can kind of nudge people in certain directions.

[01:44:15.310]I agree. And I do think that explanations can be useful for practitioners as well. We need to learn more about how these beasts work, and one way to learn more is to look in a more fine-grained way into their predictions, not just true or false, or yes or no. We need to understand why they are going there, and this can help the community bring these systems to the next level.


[01:44:52.550]I think the explainability part that's quite interesting is in the context of uncertainty as well. Can you understand where the model is not doing super well? And then can you perhaps collect more data, or, in the case of life science or whatnot, can you conduct some real-world experiments that you can use to retrain? That concept is really interesting. I just feel like there are a lot of real-world applications where machine learning can do a lot of good, and where it seems like people are gatekeeping it because of explainability,

[01:45:20.150]whereas what we have to date, with humans doing a task, is not good enough.
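
One simple way to act on the "where is the model not doing well, and can we collect more data there?" idea is uncertainty sampling. The Python sketch below is an illustration under stated assumptions, not something the panel described: it assumes the model outputs class probabilities per example, uses predictive entropy as the uncertainty measure, and the names `entropy` and `most_uncertain` and the example probabilities are hypothetical.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy of a predicted class distribution (higher means less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(predictions: Dict[str, List[float]], k: int) -> List[str]:
    """Return the ids of the k examples whose predictions have the highest entropy."""
    ranked = sorted(predictions, key=lambda example: entropy(predictions[example]), reverse=True)
    return ranked[:k]


# Hypothetical model outputs: example id -> predicted class probabilities.
predictions = {
    "sample_1": [0.50, 0.50],  # the model cannot decide
    "sample_2": [0.99, 0.01],  # the model is confident
    "sample_3": [0.70, 0.30],
}
to_label = most_uncertain(predictions, k=2)
print(to_label)
```

The selected examples are candidates for additional labelling or for a follow-up experiment before retraining; entropy is only one of several possible uncertainty measures.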


[01:45:30.730]There's a project by the EU on AI, and they did a lot of really good work. And this isn't just graph or just machine learning; it looks at larger AI systems and the things you want to review and assess, and they actually put together a big topic. And I think I agree with everyone: there's just a lot to be done, and it's not easy. But that doesn't mean we get to wait five years or ten years and see what happens.

[01:46:02.230]I think we all have to do what we can and assess what we're doing. And they did a really nice job of putting together an assessment checklist just to say, is your AI ethical? I think it's like 100 questions you can start looking at. And so it at least gives people a starting point to start thinking about that, as opposed to just saying, hey, this is hard, I'm just going to throw up my hands.


[01:46:30.370]Do you think at some point there are some tests that one can use, or that we can agree on, that would be generally applicable to certain types of architectures or tasks?

[01:46:44.630]I think we are starting to see frameworks developed. I don't think it's going to be as easy as pumping your recommendation system through an evaluation. I don't think that's coming soon, but what we are already seeing is governments putting in guidelines and starting to review. And I have a link to the EU ethics assessment. So they're already starting to put these things out. And in the US, I think a year and a half ago, NIST, the National Institute of Standards and Technology, started a project as well to take a look at what the guidelines should be.

[01:47:27.110]What should our standards be? And they haven't put anything out as far as what the standards should be specifically. But I think what we're going to see is more of a framework: have you evaluated for X, Y, and Z? Have you checked your data for bias? Have you done a post-model assessment to understand what the risk is if we get the prediction wrong? Is it just that they get a bad movie, or do they not get credit? So you do this kind of risk assessment as well.

[01:47:57.110]And I think that's probably what we're going to see: people and regulations that move us towards asking, have you followed a framework?

[01:48:10.950]Yeah, that's interesting. It has definitely been a new construction, at least in health care, to do with clinical trials and new diagnostic models, that they start to explicitly ask: how did you collect your data? How did you determine that your model is not biased across a variety of different categories? Because these things are really important. And there's an interesting anecdote in the context of COVID, I don't know if you read the article in The New York Times, but it turns out that one of the clinical trials, I think it was Moderna's, was delayed by two weeks because the government told them that they hadn't sampled the population they were running the clinical trial on in a diverse enough way.

[01:48:55.530]So they had to go recruit trial participants from different demographics that weren't properly covered. And it turns out that if they hadn't done that, Moderna would probably have released their results ahead of the election results. And then what would have happened might have been quite different from what did happen.


[01:49:19.610]I guess, in the last few minutes, do you have any suggestions of good resources, or researchers or groups that people should pay attention to if they're interested in keeping up with the state of the art in knowledge graphs and machine learning? Any favorite newsletters or people or places that they should keep in touch with?

[01:49:41.150]We're all going to say this conference, right?


[01:49:44.930]We'll come back next year.


[01:49:50.370]I'm also a big fan of the KDD conference; I put the link in the chat.


[01:49:58.110]It's a little more on the research bent, but you have a lot of industry that participates in that as well. I think they're doing a lot in that space. So those are two of my favorites there.

[01:50:12.390]I can also recommend the AKBC conference, which I just posted the link to, but I might be a bit biased because I organized it one year.


[01:50:32.350]So I guess with that, we'll wrap up the discussion. I think it's been really fun covering a lot of different ground and a lot of ideas. We dropped quite a few notes in the chat as well, so folks can follow up on some of the topics that we discussed. Thank you, everybody, for participating and for joining, and I wish you a good rest of the day, and a good morning out in the US.

[01:50:57.730]Thank you.

[01:50:58.690]Thank you very much.

[01:51:00.370]Bye bye.

[01:51:01.030]Thank you.

[01:51:01.690]Bye, everyone.

[01:51:04.670]Bye bye.

[01:51:06.770]We hope you enjoyed the podcast. To get more of our own material and to keep up with the latest industry and research news from our domain, we invite you to connect with us. Connected Data London has an omnichannel presence. Besides all major podcast platforms, YouTube and SlideShare, you can find us on Twitter, LinkedIn, Facebook, and Instagram. You can join our Meetups, or you can keep up with our news and special offers by joining our mailing list.


Connected Data World 2021  All Rights Reserved.

Connected Data is a trading name of Neural Alpha LTD.

Edinburgh House - 170 Kennington Lane
Lambeth, London - SE11 5DP