In this #LØRN episode, we meet Vinay Setty, Associate Professor at the University of Stavanger and co-founder and CTO of Factiverse. The conversation focuses on artificial intelligence and on his company Factiverse, whose main goal is to empower journalists to produce high-quality journalism using cutting-edge AI and NLP. The company believes it can save journalists up to 2 hours every day. We will also learn why he is so inspired by Spotify.
With Vinay Setty and Silvija Seres
Welcome to Lørn.tech - a collaborative learning effort about technology and society. With Silvija Seres and friends.
SS: Hello and welcome to a Lørn podcast. I'm Silvija Seres and my guest is Vinay Setty. Welcome, Vinay. This podcast sorts under the topic of Artificial Intelligence. We're going to be talking about your startup "Factiverse" and about your research work at the University of Stavanger. So to get us started, I'd just like to ask you: who are you, and why do you like working with AI?
VS: Yes, I'm an associate professor at the University of Stavanger. I have a background in Data Mining, Machine Learning, and Deep Learning. AI is a loaded term, in my opinion. A lot of things are labeled as AI, but in my view, Machine Learning and Deep Learning are what people usually label as AI. I have been doing research on natural language processing and information retrieval since 2009. I get excited about what we call "unstructured textual data" and "large-scale data", and about gathering interesting information out of it, which we call "information extraction" and "data mining". On news archives, that includes classification, which is a text classification problem, as well as ranking and recommendation of news. I have been excited about these problems since 2009, and that's how I got started in AI.
SS: Where did you study and how did you end up in Norway?
VS: I'm originally from India. To pursue my Master's, I went to Germany from 2008 to 2010. That's when I was first exposed to information retrieval and Data Mining problems, and then I moved to Norway to do a Ph.D. in Computer Science at the University of Oslo. After that, I did a bit more traveling back and forth, to Germany and Denmark, as a postdoc and assistant professor, and finally ended up as an associate professor at the University of Stavanger.
SS: Very cool. You’ve worked with Spotify?
VS: Yes, during my Ph.D., I had an opportunity to visit Spotify as a researcher. It was a six-month internship, where I worked on their notification system, which is used for music recommendation and is what we call a "publish/subscribe system". Funnily enough, I was actually sitting adjacent to the office of Daniel Ek, the founder of Spotify, and I was quite impressed by him taking on the big music labels and convincing them to change their model of music distribution. In hindsight, I think this was an inspiration for me to venture into entrepreneurship as well.
SS: Yes, we need to talk to you after the conversation, because what we are trying to do in Lørn is to use AI and the personalization and the recommendation engine to achieve what Netflix has achieved with people and their new viewing habits, or Spotify with people and their new listening habits. I think we need to change people's learning habits just the same way, by making learning fun and easy and kind of self-guided. It would be really, really cool. There is a lot of learning data by now, the question is, who can make sense and a good prediction engine out of them? I am wondering if you could tell us a little bit about also your startup. So, what is "Factiverse" and why did you feel like you needed more work to do on the side of being a professor?
VS: To be honest, commercialization or entrepreneurship was not on my list of ambitions. It began as a research project at the University of Stavanger, and the timing was right. We started the project in 2017. It was the time when Brexit and the U.S. presidential election were happening, and misinformation and fake news were major problems. Given that I have experience in Machine Learning and Data Mining, especially on text data, working on news archives and using AI for recommendation and information extraction, I thought: why not use this expertise and technology to solve a timely and important problem like fake news detection and fact-checking? That's how the research project started. The technology transfer office at the University of Stavanger, called Validé, is always looking for commercialization opportunities. They talk to researchers, and they approached me and suggested that this could be an excellent opportunity to commercialize. I was a bit hesitant at the beginning. It took a lot of convincing before we decided to do it. First, we applied for a patent, which is the gateway to commercialization. We got it within one year, and then "Factiverse" was born.
SS: What's the patent about?
VS: The patent is about a deep neural network architecture for detecting fake news and for fact-checking. You give it a claim, any arbitrary claim, together with supporting or disproving evidence. It also considers who made the claim, what the claim is about, and where the supporting evidence is coming from. The deep neural network considers all these features together, holistically, and comes up with a prediction of whether the claim is correct or false based on this information.
SS: Just help me understand: does it look for possible confirmations of the claim in other sources, or how do you compute this?
VS: Yes, I think it is an excellent question. It is important to set expectations right about what deep learning and AI technology can do. It's not a truth machine. It is given a claim along with related documents. Then it classifies whether the claim is supported or disproved by those documents.
SS: And if you can't provide related documents, that is proof in itself, I guess.
VS: Right. If there are not enough related documents for the claim, then the result is inconclusive: there is insufficient data to make a prediction.
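The three outcomes described above (supported, disproved, or not enough evidence) can be illustrated with a small sketch. This is not Factiverse's patented model: the `Evidence` type, the `stance` score, and the `min_docs` and `margin` thresholds are all hypothetical names and values chosen for illustration. The stance scores are assumed to come from some upstream classifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    source: str    # where the document comes from, e.g. "statistics database"
    stance: float  # hypothetical score in [-1, 1]: -1 disproves, +1 supports


def verdict(evidence: List[Evidence], min_docs: int = 2, margin: float = 0.2) -> str:
    """Aggregate per-document stance scores into a label (toy rule, not a neural network)."""
    # Too few related documents: refuse to predict rather than guess.
    if len(evidence) < min_docs:
        return "insufficient evidence"
    mean_stance = sum(e.stance for e in evidence) / len(evidence)
    if mean_stance > margin:
        return "supported"
    if mean_stance < -margin:
        return "disproved"
    # Evidence exists but points both ways.
    return "inconclusive"
```

The design point is that "no prediction" is a legitimate output: a claim with too few related documents is neither confirmed nor refuted.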
SS: Very cool. Is there a special kind of challenge working in Norwegian? Can this work across languages? Or how does that work?
VS: You're right. We started with English first because that's where we have sufficient training data available. We are working on prototypes for supporting the Norwegian language as well. The main challenge with Norwegian is the availability of training data. We rely on manually fact-checked data. There are manual fact-checking organizations, both in Norway and internationally, who select claims that they want to fact-check. They then spend days or weeks researching them, extracting relevant documents, information, and sources, and they come up with a verdict that the claim is true or false. The deep neural networks we have trained are bootstrapped from this training data from manual fact-checks. There is abundant data available for English, but it's a challenge for fact-checking in Norwegian. However, there are deep neural networks that can work in a cross-lingual setting, which means that we can train the network in one language and adapt it to work in another language. Right now we are working on that.
SS: Very cool. So in this "Factiverse", I love the name, it's a universe where facts matter, do you believe that it will be a problem that is going to be possible to solve by this technology, or is it a hybrid solution? Is it a technology to help you halfway, and then, the people will have to use whatever comes out to make up their own judgment? I guess what I'm trying to get to is that I believe that there is never going to be a world where we don't have to use our critical thinking. What do you think?
VS: I totally agree with you. Fact-checking is a difficult problem and the truth is subjective, so it is impossible to build a truth machine. Instead, what we try to do is provide all the information supporting or disproving the claim, and then let the users decide for themselves. So, instead of only coming up with a binary label that this claim is true or false, we also explain that the claim is true because we found these sources, whether they come from scientific literature, Wikipedia, statistics databases, or even credible news sources. Based on this information, the deep neural network model came up with a prediction that the claim is true or false. The users, in our case journalists or people researching for articles, can decide for themselves how to use this information.
SS: Very interesting. Where do you want to take this? What's your hope for Factiverse?
VS: So we started off as a fake news detection tool. In fact, our first prototype was a browser extension. With the click of a button, whatever you were reading, whether an article, a claim, or any sentence that you selected, you could get a verdict that it's true or false. But then, after talking to people in the media, especially within Norway, for example journalists and the largest news agency in Norway, it became clear that they need tools for researching and fact-checking rather than a fake news detection tool. Journalists do not have a problem with fake news detection. They are smart enough to distinguish whether something is fake news or true news. What they need help with is writing the article. They are working under time pressure and they do not have sufficient time to fact-check what they have written. We thought that our technology could be adapted into what we call an "AI Editor", which checks the claims or statistics as you are writing the article and gives you an overview of the claims that we think are not credible, or that don't have sufficient supporting evidence to back them up. Maybe you need to do more research, or maybe you need to change it, or maybe you just made a factual error that needs to be fixed.
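To make the idea of flagging claims in a draft concrete, here is a minimal sketch. It is an illustrative heuristic only, not how the Factiverse editor works: it assumes that a sentence containing a digit is a statistic worth checking, and the function names `split_sentences` and `check_worthy` are hypothetical. A real system would use a trained claim-detection model instead.

```python
import re
from typing import List

# Hypothetical heuristic: any sentence containing a digit is treated as check-worthy.
NUMBER = re.compile(r"\d")


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_worthy(text: str) -> List[str]:
    """Return the sentences of a draft that contain numbers and so may need a source."""
    return [s for s in split_sentences(text) if NUMBER.search(s)]


draft = (
    "Norway is a great place. "
    "About 98 percent of Norwegians are online. "
    "Many pay for news."
)
flagged = check_worthy(draft)
```

Each flagged sentence would then be passed to an evidence search and verification step, so the writer sees which statements still lack support.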
SS: What do you think about being an entrepreneur from Norway? I know you're based in Stavanger, but I would like you to comment a little bit with your international background on the pros and cons.
VS: I think that Norway is an excellent place for start-ups. There is excellent support from the government. There is Innovation Norway and Forskningsrådet, which is the Norwegian research council, so there are sufficient funding opportunities for startups to get bootstrapped, and there are excellent accelerators. We are part of a start-up lab, and Validé also has a start-up accelerator program. All these accelerator programs and funding opportunities make it an excellent place to be for startups.
SS: And there is I guess the issue of the Norwegian language in the market, but on the other hand, as you say, what you're doing has international potential. I think AI will have a special language-related role, it will have a very interesting kind of local-global aspect to it because there will be something that will have to be done in local languages.
VS: Definitely. Our goal is to first master the applications for the English language and then adapt them to Nordic languages and other European languages, so it will be like a one-stop, go-to place for automated fact-checking and for research when you're writing articles in any language you prefer.
SS: I also like the point that you made to me before the podcast, that Norwegians are avid readers of newspapers and consumers of news. I think maybe we are the country in the world that spends the most per capita on news. So I guess it's a good hunting ground.
VS: Yes. In fact, that is one of the reasons why being in Norway is an excellent situation for Factiverse. In Norway, digital literacy and internet penetration are at about 98 percent, and as you mentioned, a large fraction of Norwegians pay for the news. Also, in terms of the number of newspapers in circulation per capita, I think it's one of the largest in the world. So, if we were to experiment with using automation for fact-checking, then this would be the perfect crowd.
SS: Are you able to find the talent that you need?
VS: The talent that we need has a Machine Learning and Data Science background. It is not easy to find, but fortunately, my association with the University of Stavanger, where I teach Data Mining and Deep Learning, enables me to recruit talented students. In fact, the two data scientists that we have working at Factiverse are from the Data Science program at the University of Stavanger.
SS: Very interesting. I also feel very positive about you being a researcher with this experience in both entrepreneurship and commercialization. But how is this moving your research?
VS: I view this as complementary to my research, because my research is also about Text Mining and using AI for natural language processing applications, and fact-checking and fake news detection are an excellent application of that. We take the research prototypes and ideas that we develop at the University and test how they work in real-world situations through the startup. I think it's a perfect synergy. It goes hand in hand. I don't view these as two completely disconnected jobs. The new ideas that come up during my research work can be applied in my job at Factiverse.
SS: It's a good data hunting ground and a good place to apply what you are researching. I also wonder if you could give some advice to people that are done studying since AI is a really important topic going forward. How would you recommend them to learn, and start using perhaps?
VS: I recommend that they get started with some projects. There is plenty of data available online. For example, Kaggle has lots of competitions where students who are finishing can compete. In addition to that, I advise them to do internships with startups or even established companies. For example, we would be happy to have qualified candidates as summer interns at Factiverse.
SS: Same with us at Lørn. That's a great point. I think that people also underestimate the value that they can bring in with a good mentor on the side. Have you looked into things like digital online courses or shorter courses on AI? Personally, I love this book about the Deep Learning Revolution. I think that's a very inspiring place to start. Do you have a favorite book?
VS: I like a book called "Dive Into Deep Learning", but I forgot the author.
SS: This is good advice. I haven't read it. Why do you like it?
VS: It's a good balance between theory and practice. First, it explains the theory, and then, right away, it gives coding examples. You can even open it as a Jupyter Notebook, which is a kind of tool for Python Programming, and then start experimenting right away. They have a nice library that includes the necessary datasets where you can experiment. It's a good learning experience.
SS: Very cool. Actually sounds like fun. I like to talk about how we need to make these sandboxes and start playing with these tools, not just look at them from the distance. It sounds like a really good playground for starting with AI. Do you have a role model within research? Is there somebody that inspires you? Especially in your field?
VS: Yes. During my stay at the Max Planck Institute in Germany, I had a mentor. He works on natural language processing and information retrieval. He's not an AI guy per se, but he was my role model and got me interested in using Machine Learning, Data Mining, and Text Mining techniques to solve problems in natural language processing and news processing.
SS: Very good. What's your most positive experience from Corona?
VS: The most positive experience is that we have become more productive. For example, since most meetings have moved online, it has minimized the travel time. For establishing Factiverse, I might have had to travel between Oslo and Stavanger probably 50 times, but this has saved a lot of time for me.
SS: Yes, I agree. Also, we all work a little too much in this new efficiency. It's very interesting: I spoke with some people a couple of days ago about Corona, and the thing they miss the most is that travel time. The travel time to and from meetings, the breaks, the moving between meeting rooms at work: those natural breaks were a part of our workday, and now we have edited them out with all these new digital efficiencies. To be honest, we are working ourselves into the ground with no time to fill up our coffee cup.
VS: Yes, I totally agree with you. On the one side, we have become more productive, on the other side, everybody has the "Zoom Fatigue".
SS: Yes. Zoom and everything else. The other thing that I find interesting is, and it relates a little bit to AI, we will have these wonderful new tools but I notice how much I miss the humans in my colleagues. We work together and we deliver together, but every meeting is a transactional meeting. We don't spend much time in Teams or Zoom or on the phone just chatting and, frankly, I always thought I didn't have time for that but now, I really miss that.
VS: I think it might change the way we interact with people moving forward. We view our technology as a tool for assisting human fact-checkers and journalists, not replacing them. They will be the ones deciding which conspiracy theories or which misinformation they want to fact-check. We are just assisting them and making that work faster.
SS: Very good, Vinay Setty. I really enjoyed talking to you. Thank you so much for spending time with us here in Lørn and teaching us about a very cool, new application of Artificial Intelligence.
VS: Thank you Silvija. It was a pleasure talking to you.
You have now listened to a podcast from Lørn.tech - a collaborative learning effort about technology and society. You can now also get a learning certificate for listening to this podcast at our online university, Lørn.university.
Your background: Who are you and how did you become interested in technology?
I am an Associate Professor at the University of Stavanger. I teach and conduct research in Data Mining and Deep Learning with applications to Natural Language Processing (NLP) and Information Retrieval. Since 2009 I have been working on research problems related to extracting information from news archives and classifying, ranking, and recommending news. I also spent 6 months at Spotify as a visiting researcher and worked on interesting problems related to music recommendation and notification systems for music. Even though I never dreamed of founding a company, I was inspired by Daniel Ek, the founder of Spotify, who convinced huge music labels to change their distribution models. I was sitting adjacent to his office when I was at Spotify. In 2017, realizing the gravity of misinformation and disinformation as a societal challenge, I thought about using machine learning and AI to solve this problem.
What is your most important role at work?
I am the co-founder and CTO of Factiverse.
What is the main focus/goal at Factiverse?
Our vision is “A universe where facts matter”. Our main focus is to empower journalists to produce high-quality journalism using cutting-edge AI and NLP.
What is your most important project this year?
We are working on an AI editor that lets journalists research and fact-check at the same time as they write articles. It will also serve as a dashboard with several insightful analytics, such as bias detection, timelines, and social media coverage. We believe it can save journalists up to 2 hours every day.
What do you think are the most interesting controversies?
Verifying the truthfulness of facts is difficult, and it is impossible to build a truth machine! Instead, the best we can do is to provide all the information related to the claims from credible sources and let the users decide for themselves.
What do you think needs to be considered for the future?
A lot of fact-checking happens manually today, which can take days to weeks. False facts spread 100 times faster than true facts. We need more automated tools to stop false facts at the source before they spread.
Is there anything we do well in this field here in Norway?
Norwegians are among the most avid consumers of news in the world. Norway has the largest share of paying news subscribers in the world (48% pay for news). With 98% of people in Norway having access to the Internet, digital literacy and awareness are among the best in the world.
What do you think is the most important takeaway from our conversation?
Automating fact-checking is a hard problem but there is hope with cutting-edge AI technologies and natural language processing.
Get together with a friend or a colleague and see if you can answer the question below.
Fake news spreads 100 times further than the truth. How is your newsroom dealing with misinformation?