Hello, world, it’s Siraj, and I’m gonna show you how I read research papers and give you some additional tips on how you can consume them more efficiently. Reading research papers is an art. Whether the topic is machine learning or cryptography, distributed consensus or networking, in order to truly have an educated opinion on a particular topic in computer science, you’ve got to get yourself acquainted with current research in that subfield. It’s easy to agree with a claim if it’s got enough hype behind it, but being critical and balanced in your assessment is a skill that can be learned. PhD students are taught how to do this in grad school, but you too can learn how to do this. It just takes patience and practice and coffee. Lots of coffee.
Every single week I read between 10 and 20 research papers in order to keep up with the field, and I’ve gotten better at it over time. I don’t have any graduate degrees. I’m just a guy who really loves this stuff, and I teach myself everything using our new collective university, the internet. One of my favorite resources to find papers on machine learning is the Machine Learning subreddit. People post papers they find interesting every day, and they’ve also got this cool weekly “What Are You Reading” thread where people post the papers that interest them the most currently. Additionally, there is this web app called Arxiv Sanity (arxiv-sanity.com), created by Andrej Karpathy, which basically goes through arXiv and finds the papers that are most relevant. You can filter them by what interests you, by which ones are most popular, or by the ones that are most cited.
Lately, Google and DeepMind each publish their work on their own websites for easy access, and there are, of course, journals like Nature where you can easily find some top papers. The pace of research is accelerating in machine learning because of a few reasons, not including Schmidhuber. In academia and in the public sphere, the democratization of data, computing power, education, and algorithms is all steadily happening over the internet. Because of this, more people are able to make their own insights into this field. In industry, the big tech companies profit more when their own teams discover new machine learning methods, so there’s this race to create faster, more intelligent algorithms. All that is to say that there are a lot of papers you could be reading right now.
So how are you supposed to know what to read? Well, what I found is that every week there are maybe two or three papers that are getting the most attention in machine learning, and the tools I’ve mentioned help me find them and read them. But most of my reading is a result of me having a goal. That goal could be to learn more about activation functions, or perhaps probabilistic models that use attention mechanisms. Once I’ve got that goal, it makes it much easier to create a reading strategy that points towards that goal. Just being a good math-heavy machine learning paper reader is not a goal to aspire to.
Your stamina is more of a function of human motivation, which is a function of the goals you’re trying to accomplish. I’ve found that I can crush through and understand the most difficult papers much more easily when I have a real reason to do so.
So let’s take the landmark paper by a friend of mine, Ian Goodfellow, on generative adversarial networks as an example. There is a lot in this paper. He synthesizes some ideas here that made Yann LeCun say that this concept was the coolest idea in deep learning in the last ten years. The way I read papers is by performing a three-pass approach. On the first pass, I’ll just skim through the paper to get the gist of it, meaning I’ll first read the title. If the title sounds interesting and relevant (generative adversarial networks, yo, let’s go), I’ll read the abstract.
The abstract acts as a short standalone summary of the work of the paper that people can use as an overview. If the abstract is compelling (an adversarial process between two neural networks that resembles a game? All right, this is lit), then I’ll skim through the rest of the paper. By that I mean I’ll carefully read the introduction, then read the section and subsection headings, but ignore everything else, mainly the math. I never read the math on the first pass. I’ll read the conclusion at the end and maybe glance over the references, mentally ticking off the ones I’ve already read, if there are any. I just assume the math is correct on the first pass. My goal for this first pass is to just be able to understand the aims of the author.
What are the paper’s main contributions here? What problems does it attempt to solve? Is this a paper I’m actually interested in reading more of? Once I’ve done the first pass, I’ll go back to see what other people are saying about this paper and compare my initial observations to theirs. Basically, the aim of this first pass is to ensure that it’s worth my time to continue analyzing this paper. Life is short and there are too many things to read.
If it does pique my interest, then I’ll reread it a second time. On the second pass I’ll read it again, this time more critically, and I’ll also take notes as I go. I’ll actually read all the English text and I’ll try to get a high-level understanding of the math that’s happening in the paper. So it’s a minimax game that looks to reach a Nash equilibrium. Okay, I kind of get that. Eventually, the generator network creates fake samples that are indistinguishable from the real thing, so the discriminator is powerless. Cool. I’ll read the figure descriptions and any plots and graphs that are available, and try to understand the algorithm at a high level. A lot of times the author will break down an equation by factoring it out. I avoid trying to analyze this on the second pass.
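For reference, the minimax game mentioned here can be written down in a couple of lines. This is the value function as it is usually stated for generative adversarial networks, where D is the discriminator, G is the generator, p_data is the real data distribution, p_z is the noise distribution fed to the generator, and p_g is the distribution of generated samples. Treat it as a reading aid for the second pass, not a substitute for the derivation in the paper.

```latex
\[
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
\]

% For a fixed generator, the best possible discriminator is
\[
D^{*}_{G}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}
\]

% When the generator matches the data (p_g = p_data), D^* = 1/2 everywhere
% and the value of the game is
\[
V(D^{*}_{G}, G) = \log\tfrac{1}{2} + \log\tfrac{1}{2} = -\log 4
\]
```

That last line is the "discriminator is powerless" moment: once the fake samples look exactly like the real ones, the best the discriminator can do is output one half for everything.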
I see that it’s using a loss function called the Kullback-Leibler divergence. Never heard of that one, but I do get the concept of minimizing a loss function. When I read the experiments, I’ll try to evaluate the results. Are they repeatable? Are the findings well supported by evidence? Once I’ve done that, hopefully there is some associated code in a repository available on GitHub. I’ll download the code and start reading it myself. I’ll try to compile and run the code locally to replicate the results as well. Usually comments in the code help further my understanding.
I’ll also look for any additional resources on the web that help further explain the text: articles, summaries, tutorials. Usually a popular paper will have a breakdown that someone else has done online that will help drive the key points home for me. After this second pass, I’ll have a Jupyter notebook full of notes and associated helper images, since I teach this stuff on YouTube. Teaching is really the best way to fully understand any topic.
When it comes to the third pass, it’s all about the math. My focus on the third pass is to really understand every detail of the math. I might just use a pen and paper and break down the equations in the paper myself. I’ll use Wikipedia to help me understand any of the more formal math concepts fully, like the KL divergence, and if I’m feeling really ambitious, I’ll try to replicate the paper programmatically using the hyperparameter settings and equations that it describes. After all of this, I’ll feel confident enough to discuss it with other people.
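As an example of breaking a formal concept down yourself, here is a minimal sketch of the KL divergence for two discrete distributions, KL(p || q) = sum over i of p_i * log(p_i / q_i). The function name kl_divergence and the toy distributions p and q are made up for illustration and do not come from the paper; the natural log is assumed, so the result is in nats.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a term with p_i = 0 contributes 0 by convention
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical toy distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, p))  # identical distributions give 0.0
print(kl_divergence(p, q))  # positive when the distributions differ
print(kl_divergence(q, p))  # a different value: KL is not symmetric
```

Playing with small cases like this makes two properties obvious that are easy to skim past on paper: the divergence is zero only when the two distributions match, and swapping the arguments changes the answer.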
Reading papers is not easy, and nobody can read long manipulations of complicated equations fast. The key is to never give up. Turn your frustrations into fuel to get better. You will understand this paper. You will master this subject. You will become awesome at this.
It gets easier every time as you build your Merkle DAG of knowledge. See what I did there? If you don’t get a math concept, guess what? Khan Academy will teach you anything you need to know for free. And lastly, do not hesitate to ask for help. There are study groups and communities online centered around the latest research in machine learning where you can post your questions. Don’t be afraid to reach out to researchers as well. You’re actually doing them a favor by having them explain it to you in terms you understand. All scientists need more experience translating complex topics. I’ve got lots of great links for you in the description, and I hope you found this video useful.
If you want to learn more about machine learning, AI, and blockchain technology, hit the subscribe button. For now, I’ve got to reread the Capsule Network paper, so thanks for watching.