Meet Your New Lab Partner – AI as a Research Assistant in Data Science | Dr. Gabriel Avelino Sampedro

Transcript

All right, so my talk today is entitled “Meet Your New Lab Partner: AI as a Research Assistant in Data Science.” My talk today might be quite controversial, as some people might think of using ChatGPT as cheating. But I hope it will inspire more conversation regarding this issue and maybe give you a different perspective on it. So before I start my talk, I’d like to share this short serendipity. It’s a quote from Larry Page: “Artificial intelligence would be the ultimate version of Google, the ultimate search engine that would understand everything on the web. It would understand exactly what you wanted and it would give you the right thing. We’re nowhere near doing that now. However, we can get incrementally closer to that. And that is basically what we work on.” So I forgot what year Larry Page said this, but, surprise.

We’re here today where AI, where ChatGPT can actually be the next Google. Whatever it is that you want to know, you can actually ask ChatGPT and it most likely, well, there’s a good chance that it will actually give you the correct answer. So hopefully this talk today will inspire you and show you the different perspectives in using AI, which I believe is just a tool. So we can use it for good, we can use it for bad. Part of my discussion would also be ethical usage. So for the agenda for today’s talk, I’ll first talk about what AI is, give you a brief explanation of how it works, what it is all about. And then I’ll talk about the different uses of AI specifically in research. AI can be used as your development partner, as your data science partner, and even your research partner. I’ll also discuss different ethical considerations, dependencies, future prospects, as well as final thoughts on this matter.

So first, let’s talk about what AI is. So instead of doing a Google search on what AI is all about, I asked ChatGPT, what is artificial intelligence? And ChatGPT 4.0 responded: “AI, or artificial intelligence, refers to the simulation of human intelligence in machines that are designed to think and act like humans. These machines are programmed to perform tasks that typically require human intelligence, such as learning, problem solving, reasoning, understanding natural language, and recognizing patterns.” So interestingly, ChatGPT, or sorry, artificial intelligence in general, is meant to simulate human intelligence. The way we think is pretty much the basis of how AI was designed.

So AI was actually designed to think like humans and it has really evolved to what it is today. As mentioned earlier by Professor Flor, ChatGPT was even used to ace Japanese medical exams, which is quite alarming. Alarming, and maybe it can also be an eye-opener if we look at it from a different perspective. Is the way that we’re giving examinations correct? Is this the approach that we want, or should we maybe be testing other aspects instead of theory? So these are things to ponder.

So AI has drastically changed. AI began as early as 1642. It was basically a simple mechanical calculating machine built by Blaise Pascal, and it has now evolved into what it is today. AI was as simple as you challenging an AI opponent in chess. And it’s even a subject right now offered in a lot of universities. At UPOU, we actually have courses related to AI. It’s pretty much a given course to offer, especially for universities that wish to be progressive, universities that want to keep up with the times. And interestingly, there is also a university right now that’s even looking to launch its first BS AI program. So we can actually see the evolution of AI, and one of the earliest encounters I’ve had with AI was playing chess on my phone. So AI, again, was designed to think like humans. It was designed to make decisions, decisions based on my move. So the AI’s next move will be based on whatever it is that I do in chess. And AI has evolved. It used to be pretty much an if-else problem, but it became more complicated than that. And it even evolved to doing much more complex tasks.

So this right here in the image that you see is Musio. A couple of years ago, I was working with a company called AKA AI. It’s a Korean company that actually designed this robot. So this robot, Musio, is being used by Japanese and Korean students to learn English. So instead of hiring an English tutor, the kids get to talk to a robot and learn English. So we actually developed the data used to train this robot. So again, it is meant to teach English to non-native speakers using conversation.

So now that AI has really evolved, we now have a lot of tools at our disposal. Some of these tools are things like ChatGPT, Gemini, Microsoft Copilot, Anthropic’s Claude, Jasper, and even IBM Watson. So there are a lot of players out there. ChatGPT was one of the earliest, I believe, and then a lot of companies have started offering their own solutions as well.

So how does AI work? So AI works by us providing inputs and then the system basically learning. So to simplify things, let’s try to recall the way we learned when we were kids. When we were kids, we barely knew how to talk. So we only learned what apples and donuts are by seeing them. So oftentimes when we were kids, we were presented with different images of apples, different images of donuts, and then slowly we tried to extract information from that, and we tried to slowly understand what makes an apple, what makes a donut. So machines, AI systems, are taught similarly. So imagine the development of an AI system, a machine learning system, that recognizes cats and dogs.

So in order to build a system like that, you would need a bunch of images of cats and teach your system to understand what exactly a cat is. So you present your system with images of cats, as well as different images of dogs. Your system tries to extract certain features from that. Let’s say, it tries to look at the whiskers, the ears. What are the different elements that make up a cat? What are the different elements that make up a dog? So we’re extracting features, building a model, and then slowly the system tries to understand and learn what a cat is and what a dog is.
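A minimal sketch of that "extract features, build a model, then classify" loop, under loud assumptions: the two features (ear pointiness and snout length) and all of the numbers are invented for illustration, and a real system would learn its features from the images rather than take them by hand. The "model" here is just the average feature vector per animal, and a new animal gets the label of the nearest average.

```python
from statistics import mean

# Hypothetical hand-made features: (ear_pointiness, snout_length), scaled 0..1.
# These numbers are invented; real systems learn features from the images.
TRAINING = {
    "cat": [(0.9, 0.2), (0.8, 0.3), (0.95, 0.25)],
    "dog": [(0.4, 0.8), (0.3, 0.7), (0.5, 0.9)],
}


def build_model(training):
    """Average each label's examples into one 'typical' feature vector."""
    return {
        label: tuple(mean(column) for column in zip(*examples))
        for label, examples in training.items()
    }


def classify(model, features):
    """Return the label whose typical vector is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


model = build_model(TRAINING)
# A pointy-eared, short-snouted animal lands nearer the cat average.
print(classify(model, (0.85, 0.3)))
```

The more examples go into each average, the better the typical vector represents its animal, which is the same reason the system needs a bunch of images to begin with.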

So now that we have some knowledge of AI, let’s talk about how AI can help us, especially in research. So one of the main uses of AI nowadays would be in development. So AI can actually be your dev partner, your partner in developing, your partner in coding. So in programming, we’re usually given problems like this. Given the grade point system on the right, calculate the grade of a student. So we’re given different grade point averages, different equivalences, different descriptions, and we’re asked to create a program such that if you give it the percentage, it will give you the grade point average. So old school, if you want to do this, you would have to create if-else statements.
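For reference, the old-school if-else version could look something like the sketch below. The grade point table from the slide is not reproduced in this transcript, so the cut-offs and grade points here are placeholders, not the actual scale, and the function name `grade_point` is likewise just illustrative.

```python
def grade_point(percentage):
    """Map a percentage to a grade point with a plain if-else chain.

    The cut-offs below are placeholders for illustration only;
    substitute the rows of the actual grade point table.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percentage >= 96:
        return 1.00
    elif percentage >= 90:
        return 1.50
    elif percentage >= 84:
        return 2.00
    elif percentage >= 78:
        return 2.50
    elif percentage >= 72:
        return 3.00
    else:
        return 5.00


print(grade_point(91))
```

Each branch is one row of the table, checked from the highest cut-off down, so the first condition that matches decides the grade.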

But nowadays you can actually ask ChatGPT to create a program that solves the problem as is. So when I fed this image to ChatGPT, it actually gave me this. ChatGPT pretty much did all the hard work for me. So some of you might think it’s scary, but again, it will depend on your perspective of things. You might think that ChatGPT will only be replacing coders. So interestingly, I actually had a chat with my professor in Korea a week ago. And we’ve been discussing how our laboratory in South Korea started to invest in ChatGPT. I even asked if it was for paper writing, as not all the lab members are native English speakers. Interestingly, my professor told me that no, they invested in ChatGPT to help our researchers code. So a lot of lab members now use ChatGPT for coding. It can be quite frustrating if we think that our future programmers are liking what ChatGPT [unclear]. But if you think about it, do we really want innovators or do we want code monkeys? So code monkey is actually a term, kind of derogatory, but it’s used for those who will likely just code without thinking. You’re given a task, you’re basically like a robot. You’re given a task, you’re given a use case, and you have to solve that problem and code it. So interestingly, yeah, ChatGPT pretty much replaced the code monkeys.

Also, recently I joined Block Dojo, which is a venture builder that invests in idea-stage startups. So I actually wondered about the whole setup, whether it was a huge risk, since most of the founders in the program weren’t really tech people. They were just a bunch of people with ideas, and investing in them, isn’t that a huge risk? That’s initially what I thought. And then I got to talk to the organizers, and apparently they were really more interested in ideas than tech, as developers can easily be hired. And in the same way, development can easily be passed on to code monkeys. It can easily be passed on to tools like ChatGPT. So there are a lot of platforms out there that allow us to outsource the execution of product development. And these things are a bit more technical.

And amazingly, right now it can be done by ChatGPT. With some human intervention, of course, because it’s not perfect yet. And if we believe that ChatGPT can later replace coders, or even the junior-level developers, it can be quite scary. But if you think about it, it might make sense, because at least you get to focus on other things like idea generation. And these are things that AI cannot really replace. So you can always hire a bunch of coders, but it will be much more difficult to hire someone to create an idea. So as educators, we might also have to rethink our approach, given that all this new tech can be used to our advantage. So aside from ChatGPT, there are actually different tools out there used for coding. So in using AI as your development partner, there is also this tool called GitHub Copilot. So for GitHub Copilot, I’ll be sharing a video. I hope you can hear it.

So in this video right here, you can see how AI can actually be used to provide code. So you simply just need to give the system what you want, your requirements. “How do I design a responsive layout in Primer?” And then GitHub Copilot even provides you the code later on. So there’s actually a lot of opportunity and potential in this part, as AI can amazingly be used for coding. “Create a new button component”: type it into this tool, GitHub Copilot, and it actually provides you the exact code that you need. It saves you a lot of time, because as developers, if we don’t know something, we’d often resort to Googling it and we’ll scroll through hundreds of pages to find the right solution. But right now with AI, we can easily generate these things. We can easily ask Copilot as well to do things like finding out the latest issues and even integrating it with different tools.

So AI is used right now to generate code snippets and even entire functions. So there are a lot of tools out there, not just ChatGPT; there’s, again, Copilot and others out there. So aside from generating code, AI can also be used to review code for best practices and standards. So we can actually feed our code to different AI platforms and identify potential bugs, vulnerabilities, and so on. So examples of the different tools out there for code review would be DeepCode, Snyk, and CodeGuru.

AI can also be used for testing. So whenever we perform testing, we usually resort to automated testing. AI can now create test cases based on code changes. So we can use AI to do functional testing of our different programs. So these AI tools include Testim as well as Applitools. So we can actually see how AI helps us save a lot of time, even on tasks like programming, which can be quite repetitive. So given that we have a lot of load taken off our backs, we can now focus on things that AI can never do.

So with all these tools for development, we can focus on just being creative. We can now think of different solutions and get the help of AI in the development process. Because in coding, you spend a lot of time ideating, designing your program, and even a lot more time in coding. With ChatGPT helping you out, you can focus more on the design phase, which is something ChatGPT cannot do.

Next is AI as your data science partner. There are these tools, so GenAI.

So when talking about using AI as our lab partner, there are many tools that we can use for data science. Again, ChatGPT is one of the most popular tools, but there are also other tools out there like Alteryx, so you’ll learn more about Alteryx in this video and see its different potentials.

Alteryx gives you the power of an enterprise-grade solution with the ease of a spreadsheet. Welcome to the Alteryx AI platform for enterprise analytics. Break down data silos with drag and drop ease by creating a unified view of your data with over 100 built-in connectors. Drag and drop pre-built tools to transform, clean, and prepare your data for analysis, or write your own custom code, all within an easy to use self-service UI. Let AiDIN, the Alteryx AI engine, help you along the way with AI powered suggestions. Built for technical and non-technical users, Alteryx shows you not only what happened, but also predicts what may happen next. It presents the factors that affect your goals and makes understanding trends simple.

Generative AI provides use case discovery and rapid prototyping. This saves time and manual work, getting you from data to answer faster than ever. And with Alteryx, your data is not just powerful, it’s protected with embedded enterprise grade security and governance. Let AiDIN be your guide as you forecast trends, build machine learning models and inform your business strategies. Even non-technical users can get hands-on with education mode and AI built-in. Join more than 8,000 customers and over 500,000 community members to improve revenue performance, manage costs and mitigate risk. The future of analytics is here.

You’ll actually be surprised.

A lot of companies now invest, really invest, in AI. So as you’ve seen, a lot of big-name companies have even invested in Alteryx. So in terms of data science, a lot of coding is involved. So in data science, AI can actually do so many things to speed up the process. So for one, one part of data science is data cleaning. So right now you can actually feed ChatGPT your dataset, and even ask it to do things like make your data much more uniform, spot errors, fill in certain gaps. These housekeeping tasks that would require a lot of time, a lot of attention, can now be done by ChatGPT. They can be done by AI. So again, it would all still depend on the way that we prompt. I actually use this in analyzing certain sheets, and there will be times, of course, that it’s not perfect, and analyzing each and every cell might be quite difficult. So we have AI tools right now and it would all depend on the prompt. So it would basically boil down to prompt engineering. Simply put, our jobs right now would include making instructions for AI much more clear and precise instead of actually coding. We’re evolving into this new era.
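A small sketch of what those three housekeeping tasks (making data uniform, spotting errors, filling gaps) amount to in code. Everything in it is assumed for illustration: the records, the column names, the plausibility limit on age, and the choice of the median as the fill value. It is a hand-written stand-in for the kind of cleaning one might ask an AI tool to do, not output from any tool.

```python
from statistics import median

# Hypothetical raw records; names and values are invented for this sketch.
rows = [
    {"name": " alice ", "age": "34"},
    {"name": "BOB", "age": ""},       # gap: missing age
    {"name": "Carol", "age": "340"},  # probably a typing error
    {"name": "dave", "age": "29"},
]


def clean(rows, max_age=120):
    """Return (cleaned_rows, flagged_names)."""
    cleaned, flagged = [], []
    for row in rows:
        # Make names uniform: trim whitespace, consistent capitalization.
        name = row["name"].strip().title()
        raw_age = row["age"].strip()
        age = int(raw_age) if raw_age.isdigit() else None
        # Spot errors: an implausible age is flagged and treated as missing.
        if age is not None and age > max_age:
            flagged.append(name)
            age = None
        cleaned.append({"name": name, "age": age})
    # Fill gaps: use the median of the ages that survived the checks.
    known = [r["age"] for r in cleaned if r["age"] is not None]
    fill = median(known) if known else None
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill
    return cleaned, flagged


cleaned, flagged = clean(rows)
print(cleaned)
print("flagged for review:", flagged)
```

Returning the flagged names alongside the cleaned rows keeps a human in the loop: the suspicious entries are still listed for someone to check rather than silently overwritten.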

Aside from this, we can also use ChatGPT, or AI in general, to create charts and diagrams. Since AI systems can do Python, we can create Matplotlib diagrams for presenting data. So interestingly, when it comes to companies, a lot of companies hire data scientists, and these data scientists try to understand different graphs and present them to stakeholders. Now that stakeholders can easily use ChatGPT, we can now generate these graphs easily without much coding. So again, we can now focus more on what matters, like decision making. We can also focus more on idea generation and, of course, ease up the tasks of our existing data scientists. AI can also create predictive models as well as classification models. Again, all of these code snippets that we need. So through that, we can now focus more on scientific productivity.

Next will be AI as our research partner. So this can also be quite controversial, as a lot of journals out there have even released different statements on the use of AI, whether it’s allowed or not. A lot of journals have even embraced the use of AI, but of course have provided guidelines for its usage, same as UPOU with its guidelines.

So aside from using ChatGPT, there are also other tools out there, like Scholarcy, which we can use for research, analyzing data, extracting data from different papers, and you will learn more about it in this video.

I need to have a clean desk. Generally I light a candle to get my space cleared. Before I start studying, I always need a cup of coffee. Always, things aren’t going well. Retaining information, remembering it. I have quite a lot of reading to do. Finding my sources. Being a master’s student, I have so many post-bugs, exams and research projects going on. I don’t stay back with my learning at all. That’s a key challenge, I think. Like if I’m reading a journal or a book, I can just put that into Scholarcy. They’ll pick out the key points and I’m then able to transfer them into my assignment. Scholarcy basically does that instantly. Using the tabular extraction feature, I was able to export them all into the Excel sheet, which made my job very easy. It helped me to understand the key point of the research paper, to summarise the article so it’s easier for me to put it into my assignment instead of spending hours and hours reading. I’d probably go to the bottom.

In this video, you actually saw, and I’m not sure if you noticed, but there was text right here, it also popped up here, saying that you can now focus on what’s more important, and that’s actually the beauty of AI. A lot of tedious tasks are now being done by AI, so that we can focus on the more important stuff. We can focus more on designing, coming up with plans, coming up with ideas, things again, AI can never replace.

So when we’re doing research, we actually do a lot of reading. So when we survey and analyze hundreds of papers, most of the time we don’t end up using even half of them. So survey a hundred papers, sometimes we’re only going to use 10, 25, and that’s a lot of time being spent on reading things that you won’t even use. So the good thing about having AI with us right now is that we can easily synthesize papers. We can feed a document, a PDF, to ChatGPT, and ask it to give us the important points, summarize it, ask it if it’s related to our study. And from there, if it says it is, that’s the only time that we should read through it. So we can actually use ChatGPT to screen through papers and tell us what to read and what not to read. AI can also be used to extract important information from these sources, different data, different results. Especially if you’re going to do a survey in the future, you’d have to read hundreds of papers, extract all of their results and then tabulate them. So this is something we can do now with the help of AI.
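To make the screening step concrete, here is a deliberately crude sketch. It does not call ChatGPT or any AI service; it swaps in plain keyword overlap between a study topic and each abstract as a stand-in for "ask whether this paper is related to our study". The function names, the 0.3 threshold, and the sample abstracts are all invented for illustration.

```python
import re


def keywords(text):
    """Lowercased words of four or more letters, as a set."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def screen(study_topic, abstracts, threshold=0.3):
    """Return the titles whose abstract shares enough keywords with the topic.

    Keyword overlap is only a stand-in for a real relevance judgment.
    """
    topic = keywords(study_topic)
    if not topic:
        return []
    shortlist = []
    for title, abstract in abstracts.items():
        overlap = len(topic & keywords(abstract)) / len(topic)
        if overlap >= threshold:
            shortlist.append(title)
    return shortlist


# Hypothetical abstracts, written for this example.
abstracts = {
    "Paper A": "We detect faults in 3D printing using machine learning on sensor data.",
    "Paper B": "A survey of medieval trade routes across the Mediterranean.",
}
print(screen("machine learning for fault detection in printing", abstracts))
```

Only the shortlisted titles would then go on the reading list, which is the time saving described above: the full read happens after the screen, not before it.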

Aside from that, ChatGPT, AI in general, can be used for grammar checking. There are a lot of authors all over the world whose first language isn’t English. And ChatGPT, or AI in general, is setting some sort of equal footing for everyone. It’s promoting equality as well. So you have authors from countries like China, wherein the authors have a hard time expressing themselves in English, but they have really great ideas. And right now they can share their ideas with the world through AI, with AI helping them translate their work, fixing the grammar, making it easier to read. We have different tools out there, like Grammarly, “your AI partner for anyone who has work to do.” So Grammarly is now being used for proofreading. So imagine before, when we had to proofread work, we usually sent it to actual proofreaders. Right now, AI can help us ease up our load.

And proofreaders also use tools like Grammarly so that they can focus on the more important stuff, like content. There are also other tools out there like QuillBot. So whenever I write papers, there will be times when I come across writer’s block. And QuillBot really helps. QuillBot actually has this function that suggests the next sentence for me. And this helps keep my flow going. Another way QuillBot helps is through paraphrasing. So it might sound quite controversial, but QuillBot can paraphrase and help you ease up problems when it comes to plagiarism. So I’ve authored a lot of papers in machine learning, and in most of these papers, I would have to define certain concepts over and over again. Given that I wrote a lot of parts in the introduction, I tend to have a similar structure in terms of writing. So imagine writing 10 papers about ML or AI, you’d have to explain the concept 10 times too. And when you explain something, you tend to use the same words. I mean, it’s you writing the same thing over and over again. So you might tend to use certain phrases that may lead to self-plagiarism. So it’s something not intentional, but it’s just the way you write. You have a certain style and you end up using that style over and over again. And again, you’re explaining the same thing over and over again in different papers. So QuillBot can actually be used to paraphrase these things.

So there are phrases, again, that you say over and over again, and they may be flagged as plagiarism. And you can use tools like QuillBot to improve your writing and do paraphrasing. So with tasks like proofreading and grammar structure out of the way, we can again focus on the more important stuff. So in terms of research, it’s not all about just writing. It’s also about discovering new things and creating novel work.

So despite such, there are still a lot of ethical considerations that we need to think about. So there are many applications of AI, but are all of these correct or not? So later on, I’ll also be discussing a framework in a research article related to how we decide whether or not the use of AI is ethical. So one of the considerations that we have is that AI isn’t perfect. There will be times when an AI will commit a mistake. I remember when ChatGPT was new, it sucked at math. Give it a math problem, it would give you the wrong results. So it has evolved since, but again, we cannot be too reliant on AI. Human intervention is still truly needed. We cannot replace humans. So given that AI still commits mistakes, we have to watch over it. We have to check the responses, of course, before counting them as the truth.

So even though we use AI to Google right now, remember that when you Google, there’ll be times where Google would show you wrong results. In the same way, with AI, ChatGPT can give you the wrong results as well. This year, Google even admitted that its AI Overviews tool, which uses AI to respond to search queries, actually needs improvement. So while the internet search giant said it tested the new feature extensively before launching it two weeks ago, Google actually acknowledged that the technology produces some odd and erroneous overviews. So examples include using glue to get cheese to stick to pizza, or drinking urine to pass kidney stones quickly.

So these are things that we really need to watch out for because, again, AI isn’t perfect. It’s made by man and it’s still prone to mistakes. So human intervention is truly needed. Also, AI is a model tool. Simply put, it learns from our inputs. Given that it learns from our inputs, it can be biased. So everything that we give to ChatGPT is learned by it. So this can also produce erroneous results.

And this is actually one issue recently, wherein artificial intelligence tends to be biased. It tends to be biased in terms of race. Again, it would only depend on the people who use it. So if a lot of racist people use it, the results tend to be racist as well. It’s cute to think that way. Another ethical consideration in the use of AI would be privacy. So there’s still a lot of debate right now, as we don’t know where our data is stored. When reviewing journal articles, you’re often informed that the use of AI is not allowed.

So it’s not allowed not because we are being prohibited from using a tool like AI to help us. It’s not allowed simply because we don’t know where the data is stored. So whenever peer reviewers review work, they review work that isn’t published. And feeding it through AI would actually mean giving that sensitive information to AI, something that hasn’t been published before. And again, we have no control over where the data is right now.

Another issue right now is in terms of IP. So in terms of IP, there’s ongoing debate on who owns things. AI right now can even be used to replicate styles, which tend to be a gray area in the law. And who owns AI generated content as well? It’s also something for debate. Is it the AI company? Is it the person who generated it based on results?

Again, AI output is a result of prompt engineering, prompt engineering done by the users of AI. So without user input, there is no output. So who owns it? Also, without the developers of AI, there is no AI. So we can go on, spend days discussing this, but for now, it’s pretty much an open debate. So taking a look at things, AI is pretty much just a tool. So you can’t really blame a gun for killing, but the person who holds it. In the same way that AI can be used for bad, AI can also be used for good. The accountability still lies in humans.

We’ve heard a lot of talk about cheating with ChatGPT. Of course, this is something that isn’t ethical. There’s still responsible use when it comes to AI. Also, for instructors, professors, we can actually use this as an eye-opener. I was having a chat recently with a law professor from the University of the Philippines as well. And this law professor actually uses ChatGPT in his class. So it’s actually sketchy right now, since it was being discussed earlier that students answer essays using ChatGPT, which is obviously wrong, something we shouldn’t do. But at the same time, we also need to embrace the use of AI.

So this law professor that I’ve talked to, he talked about how he used ChatGPT in class. So oftentimes he would give a problem, a case, and then he would also ask ChatGPT to comment on that case and then ask the students to comment on the response made by AI. So in this way, we’re still promoting the learning process of students learning from the discussion. And at the same time, we’re also embracing the use of AI. So maybe this is something we can pick up too. So there’s this framework made by Patrick Banneret, if I pronounced it correctly, on creating ethical AI.

So here’s actually a simplified discussion of it. So first, we start at the very bottom with a use case. This is pretty much the foundation. So we have to understand first why we are using AI. We need to ensure that the purpose is ethical, aligning with moral values. Again, if we’re going to use AI, it should be used for the good. So if we’re talking about using AI to cheat an exam, obviously that’s wrong. That isn’t ethical at all. So if ever we were to use AI for good, then good. Let’s move on to the next part of this framework.

So now that we understand the use case, we have to set a goal. The goal should also be clear, ethical, and objective for the AI system. After which, we move on to the building blocks. In terms of complexity, we have to understand AI’s complexity and how it can be managed. Next would be accuracy, bias, and variance. We have to ensure that AI is accurate and free from biases. I’ll be discussing an example later on so this becomes clear. And then next would be the hidden variables, identifying any hidden factors that may affect AI’s decision.

Next we have to weigh the risks and costs in the decision process to minimize error, to minimize harm. And of course, after developing any system comes lifecycle management: continuous monitoring, explaining the AI’s actions to users, and maintaining ethical standards throughout its use. So this framework actually helps us integrate ethical considerations into every stage of AI development, from the initial idea to its ongoing use. So let’s use self-driving cars as an example to explain this framework. So for the use case, the objective of a self-driving car is to ensure that the person moves from point A to B safely. So this is still in accordance with conscience, it’s still for the greater good, therefore it’s still ethical.

Next we have to talk about the goal. So the car’s goal should be defined well. The car’s goal is to navigate roads safely, avoiding accidents and following traffic rules. Next we move on to the building blocks. In terms of complexity, we have to consider that the car should handle various driving scenarios, such as different weather conditions, unpredictable pedestrian movements, etc.

Next, accuracy. In terms of accuracy, the car needs to accurately detect and respond to obstacles. So when it’s on the road, it would need to avoid certain things, ensuring that it does not favor certain driving conditions or miss rare events. Everything has to be considered.
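One simple way to check whether a system favors certain conditions is to measure accuracy separately for each condition rather than as a single overall number. The sketch below assumes a hypothetical test log of (condition, was the obstacle detected) pairs; the conditions and counts are invented, and the framework itself does not prescribe this particular check.

```python
from collections import defaultdict


def accuracy_by_condition(results):
    """results: iterable of (condition, was_detected) pairs."""
    totals = defaultdict(lambda: [0, 0])  # condition -> [hits, trials]
    for condition, detected in results:
        totals[condition][0] += int(detected)
        totals[condition][1] += 1
    return {c: hits / trials for c, (hits, trials) in totals.items()}


# Hypothetical test log: 10 clear-weather trials and 5 rainy trials.
log = (
    [("clear", True)] * 9 + [("clear", False)]
    + [("rain", True)] * 3 + [("rain", False)] * 2
)
scores = accuracy_by_condition(log)
gap = max(scores.values()) - min(scores.values())
print(scores)
print("gap between best and worst condition:", round(gap, 2))
```

A large gap is the signal to look for: the overall accuracy in this log is 12 out of 15, which hides the fact that the rainy trials do much worse than the clear ones.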

And later on, we have to consider the hidden variables. So you might have heard a lot of debate relating to ethical dilemmas in terms of self-driving cars, like who should you hit in the event of an accident? These are the hidden variables that still need to be considered later on.

Next would be the decision support. So in terms of decision support, the system must balance the risks, like deciding the safest route, minimizing the chances of collision, even if it means taking a longer path.
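The trade-off described here, accepting a longer path for a lower chance of collision, can be sketched as a weighted cost. This is only an illustration: the `Route` fields, the risk estimates, and the `risk_weight` value are assumptions, and real route planners are far more involved.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    minutes: float
    collision_risk: float  # assumed estimate of collision probability, 0..1


def choose_route(routes, risk_weight=1000.0):
    """Pick the route with the lowest combined cost.

    cost = minutes + risk_weight * collision_risk
    A large risk_weight lets safety outweigh travel time.
    """
    return min(routes, key=lambda r: r.minutes + risk_weight * r.collision_risk)


# Hypothetical candidate routes.
routes = [
    Route("highway", minutes=20, collision_risk=0.03),
    Route("side streets", minutes=30, collision_risk=0.01),
]
print(choose_route(routes).name)                   # safety-weighted choice
print(choose_route(routes, risk_weight=0.0).name)  # time only, risk ignored
```

With the default weight the slower but safer route wins, and with the weight set to zero the faster route wins, so the weight is where the "how much is safety worth" decision actually lives.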

And then lastly, lifecycle management. Whenever we develop a system, we never develop anything perfect. So we still need to improve it over time, re-update the use case, revisit all aspects and simply make it better.

So now that we have a clear idea of the ethical use of AI, let’s head on to some final thoughts. So this is actually our future. AI is there. We will actually get there one way or another. It’s either we embrace this change or resist it until it’s too late. We also have to think of AI as just a tool. AI can be your companion. It’s not meant to replace us, it’s meant to help us so that we can focus on the more important things. Things like being creative, being innovative, these are things that AI cannot do.

Of course, we always have to consider responsible use of AI. You are still the principal investigator or the researcher. You are still the person who comes up with the idea. You’re just using AI to help you. And there is actually no gain in prohibition. AI is here to stay. And we really need to learn how to coexist with AI. AI can actually help us as a tool. If we embrace it, it can help us reach new heights. So in everything that we’ve discussed, there must always be a balance. We must always consider the way we use AI. We have to use AI wisely, balance it with our own skills, skills that are unique to us humans, us innovators, us game changers, us leaders of tomorrow.

So with this, I want to end my presentation with these parting words. Be curious, embrace change, and lead the change. And I hope you all learned something today, and thank you for your attention.

Tags: AI, artificial intelligence, data, Dr. Gabriel Avelino Sampedro, FICS Masterclass, masterclass, Meet Your New Lab Partner - AI as a Research Assistant in Data Science, research, science, upou