Original & Concise Bullet Point Briefs

Monte Carlo Methods: Data Science Basics

Exploring the Benefits of the Monte Carlo Method in Data Science

  • The Monte Carlo method is a simulation-based technique, widely used in data science, for solving problems
  • It involves simulating the problem many times and summarizing the outcomes, for example by counting how often the desired result occurs
  • This can be done even when other methods may not be applicable or known
  • An example of this was given, where a probability question was solved through simulation rather than relying on geometry
  • Another example was given which highlighted why one might choose a Monte Carlo method: it can provide a close approximate answer to harder problems where an analytical solution is difficult to find.
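
A minimal Python sketch of the dart-throwing simulation described in the first example. The function name `estimate_circle_probability` and the `seed` parameter are illustrative assumptions, not taken from the video; the counter name `circ` follows the video's walkthrough. Points are sampled uniformly from the square [-1, 1] x [-1, 1], and the fraction landing inside the unit circle should approach pi/4 (about 0.785).

```python
import math
import random
from typing import Optional


def estimate_circle_probability(n_samples: int, seed: Optional[int] = None) -> float:
    """Estimate P(dart lands in the unit circle) for a 2x2 square centered at the origin."""
    rng = random.Random(seed)  # seed is an assumption added for reproducibility
    circ = 0  # running total of simulated darts that land inside the circle
    for _ in range(n_samples):
        # Sample x and y independently and uniformly between -1 and 1.
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # A point is inside the circle if x^2 + y^2 <= radius^2, and the radius is 1.
        if x * x + y * y <= 1.0:
            circ += 1
    return circ / n_samples


if __name__ == "__main__":
    estimate = estimate_circle_probability(1_000_000, seed=0)
    print(f"estimate = {estimate:.4f}, pi/4 = {math.pi / 4:.4f}")
```

Increasing `n_samples` should bring the estimate closer to pi/4, at the cost of a longer runtime.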

Expected Number of Rounds in a Game That Ends After Two Losses in a Row, Revealed Through Simulation

  • In a game that ends after two losses in a row, where each round is won with probability p, the expected number of rounds played is (2 - p) / (1 - p)^2
  • There are three cases to consider when calculating the expected number, each with its own probability: 1) winning the first round, with probability p, 2) losing the first two rounds, with probability (1 - p)(1 - p), 3) losing then winning, with probability (1 - p)p
  • A million simulations were coded up to estimate this expected value, where in each simulation r starts at zero and is incremented every round until two losses in a row are recorded
  • n_loss tracks how many losses have been recorded in a row so far; it resets to zero after a win.
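
The three cases can be combined into one recursive equation and solved. The following is a sketch of the algebra, writing E for the expected number of rounds (the video calls it little e). In the first and third cases the game effectively restarts after a win, which is why E reappears on the right-hand side.

```latex
\begin{align*}
E &= p\,(1 + E) + (1-p)^2 \cdot 2 + (1-p)\,p\,(2 + E) \\
E\,\bigl[\,1 - p - p(1-p)\,\bigr] &= p + 2(1-p)^2 + 2p(1-p) \\
E\,(1-p)^2 &= 2 - p \\
E &= \frac{2 - p}{(1-p)^2}
\end{align*}
```

As a check, p = 1/2 gives E = (3/2) / (1/4) = 6, which matches the true value quoted in the video.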

Exploring Monte Carlo Methods: A Solution to Complex Problems

  • Monte Carlo methods can offer an easy and accessible solution to complex problems
  • They are not always fast, as the complexity of a problem or the parameters put in affect the speed of the simulations
  • Monte Carlo methods can be a great tool for those unfamiliar with the subject matter, requiring only knowledge of the problem’s rules to code a solution.
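
As a rough illustration of coding a solution from the rules alone, the two-losses-in-a-row game from the second example could be simulated as below. This is a sketch, not the video's exact code: the function names `play_one_game` and `estimate_expected_rounds`, the `seed` parameter, and the input check are assumptions added here, while `r`, `n_loss`, and `rounds` follow the names used in the video's walkthrough.

```python
import random
from typing import Optional


def play_one_game(p: float, rng: random.Random) -> int:
    """Play one game and return the number of rounds until two losses in a row."""
    if not 0.0 <= p < 1.0:
        # With p = 1 the player never loses, so the loop below would never end.
        raise ValueError("p must satisfy 0 <= p < 1")
    r = 0       # rounds played in this simulated game
    n_loss = 0  # current number of losses in a row
    while n_loss != 2:
        r += 1
        if rng.random() < p:  # win: the losing streak resets
            n_loss = 0
        else:                 # loss: the losing streak grows by one
            n_loss += 1
    return r


def estimate_expected_rounds(p: float, n_sims: int, seed: Optional[int] = None) -> float:
    """Average the number of rounds over n_sims simulated games."""
    rng = random.Random(seed)
    rounds = [play_one_game(p, rng) for _ in range(n_sims)]
    return sum(rounds) / len(rounds)


if __name__ == "__main__":
    # Fewer games than the video's one million, to keep the run short.
    p = 0.5
    estimate = estimate_expected_rounds(p, 200_000, seed=0)
    exact = (2 - p) / (1 - p) ** 2  # closed-form result from the second example
    print(f"p = {p}: simulated = {estimate:.3f}, closed form = {exact:.3f}")
```

Nothing in the code uses the closed-form answer except the final comparison, which is the point: only the rules of the game are needed to get an estimate.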

The Pros and Cons of Monte Carlo Methods for Problem Solving

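  • Monte Carlo methods can provide quick, approximately correct solutions to problems
  • However, they are not generalizable: a result for one parameter value says nothing about the result for a different value
  • They are also not interpretable: the numerical answer does not explain how the result was achieved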
  • Monte Carlo methods have both pros and cons that should be considered when deciding how to tackle a problem.
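
To make the generalizability point concrete: the closed-form result from the second example, (2 - p) / (1 - p)^2, can be evaluated for many values of p at once, whereas each Monte Carlo run covers a single p. The sketch below tabulates the formula for the values of p discussed in the video, together with the average total number of rounds that a million simulated games would require. The function name `expected_rounds` and the constant `N_SIMS` are illustrative choices.

```python
def expected_rounds(p: float) -> float:
    """Closed-form expected number of rounds, (2 - p) / (1 - p)^2, for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return (2.0 - p) / (1.0 - p) ** 2


N_SIMS = 1_000_000  # number of simulated games, as in the video

for p in (0.5, 0.75, 0.9, 0.99999):
    e = expected_rounds(p)
    # Average total work for one Monte Carlo run at this p: games times rounds per game.
    print(f"p = {p}: expected rounds = {e:,.0f}, total simulated rounds = {N_SIMS * e:.2e}")
```

The expected number of rounds comes out to 6, 20, and 110 for p = 0.5, 0.75, and 0.9, and is on the order of 10^10 for p = 0.99999, which is why the simulation becomes impractical as p approaches 1.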

[Music] Hey everyone, welcome back. In this video we're going to be talking about a very cool concept in data science, and one that gets used a lot, called the Monte Carlo method. You know I like to keep it really practical on this channel, so we're going to go through two real examples, and then we'll finish by talking about the advantages and also, most importantly, the disadvantages of using a Monte Carlo method to solve your problem versus some other way to solve your problem. So let's jump right into the first example.

This is kind of a toy example. We are throwing darts into this square whose sides have length two. You throw a dart and it's guaranteed to be somewhere in the square, but it is random where it's going to be; you're just throwing a dart randomly. You can see there's also a circle inside of the square. The circle has radius one, and the natural question, of course, is: if I throw a dart into the square, what's the probability that the dart is in the circle?

I think for most of us, after we've taken a geometry and probability course, it's pretty natural how to answer this question, because the probability of throwing it in the circle is just proportional to the area of the circle. So if we divide the area of the circle by the area of the square we should get our answer. And what would that be? The area of the circle is pi r squared, or pi times 1 squared, which is just pi, and we divide that by the area of the square, which is 2 times 2, which is 4.
So our answer is pi divided by 4.

Okay, so pi over 4 is the answer. Now let's consider a different way to solve this problem that doesn't take into account that we know anything about geometry at all. We're going to solve this purely in a simulation-based way, and I've written the pseudocode for that down here in green. Let's go through the code, because I want to make sure we understand the code for both this one and the next one, so that we understand how you would actually write the code for a Monte Carlo simulation, a Monte Carlo method.

We're going to set big N equal to 1 million. This is going to be the number of points that we are going to sample from the square, and each point is going to represent a simulated dart throw. We're going to set this variable circ equal to zero. This is going to be a running total of how many of those darts, or how many of the sampled points, also live inside the circle. And you can see that after we're done, after we do this simulation, we're simply going to divide circ by N, because that's going to give us the fraction of points, the fraction of darts that I randomly threw, that end up in the circle, which is exactly what we want. That's exactly the probability that we're after. It's just that we solved it geometrically, very naturally, before, but now we're going to solve it purely in terms of simulation. I think this is a very intuitive thing for a lot of people; maybe this is even the first thing you thought of to solve the problem.

Let's just look at the pseudocode really quickly. We're going to say for i in 1 to N. Again, we're doing a million samples, and this is important because the more samples we do, the closer we are going to get to the truth, which is pi over 4. If we only took 10 samples we're probably not going to get that close. 100 samples is better but maybe not enough. Doing a million samples should be sufficient, but of course we also know that more samples means more runtime. More on that later.

So we're going to say for i in 1 to N, we're going to sample the x and y coordinate from negative 1 to 1: x
and y are both independently going to be numbers that are sampled uniformly from between negative 1 and 1. I arbitrarily just defined a coordinate system where the center of this diagram is the origin, and so if you kind of extrapolate from there, then the left-hand side of the square is at negative one, the right-hand side of the square is at positive one, and the same thing for the y direction. So sampling x and y from negative 1 to 1 is going to exactly uniformly sample within the square.

Now the final question is: does the point that I just uniformly sampled, for a certain iteration of this for loop, also live inside the circle? We know that something lives in a circle if its x squared plus y squared is less than or equal to the squared radius of the circle. The radius of the circle here is one, so we simply need to check the condition: if x squared plus y squared is less than or equal to one, that means it lives in the circle, then circ plus equals one. circ is, again, the running total of the number of darts we throw that live inside the circle.

So go ahead and pause, go over the explanation again, or write it down, convince yourself, or even code it if you want, and you'll see that at the end of the day circ is the number of points that we simulated that live in the circle, and N is the total number of points that we simulated, so this division here makes a lot of sense.

Now, I actually coded this myself, and I'm going to throw up on the screen here the estimate I got from the simulation. You can see it's extremely close to the true value, and it doesn't take too long to compute, even for 1 million samples. So this is a good way to solve the problem if we truly had no idea how to solve it any other way. We'll have a little bit more to say about this when we talk about the pros and cons, but let's move on to the next example, because I think this one better highlights why we would choose to use a Monte Carlo method. This problem wasn't too hard, so it is difficult to justify writing code for it. But
let's look at a harder problem where, if we just don't know what to do, we can still get a really close answer using a Monte Carlo method.

Example number two says that I play a game where the probability of winning a round is equal to little p for each round. So for any given round of the game, the probability that I win that round is p. Now, this game ends when I lose two times in a row. So for example, if I play the first round and I win, the second round I lose, and the third round I lose, the game is over and I've played three total rounds. The question is: if this is how the game works, what is the expected number of rounds that I'm going to play? Another way to say that is: on average, how many rounds of this game do I play until it's over, until I get those two losses in a row?

Now, before I go on, you can even think for yourself: how would I solve this problem? What tools would I use? What formula would I write? How would I think about it? Before we get to the solution here, I think the natural way to solve it, and the way I first started writing equations down, was: okay, we know an expected value is simply one times the probability that I play one round, plus two times the probability that I play only two rounds, plus three times the probability I play three rounds, and on and on. So maybe if I add up all these numbers in this kind of infinite sum, I'll see some kind of pattern and I can get it into a nice closed form. Now, I indeed tried that, but it was a little bit difficult. The math got a little bit out of hand. Maybe you'll have better luck with it.

But then, a little while later, I had this new idea. And I want to emphasize that even though I'm sharing this as a very simple-looking solution, it took me a really long time to arrive at it. It took me a long time to think about it, and even when I did think about it, it takes some time to work through all the algebra here. So do keep in mind that there is a big time cost to solving this problem in a purely algebraic, theoretical way. But let's look at
the solution that I came up with and see why it works. We're going to let little e be the expected number of rounds that you're going to play. That is the quantity that we're after, of course. Now, it turns out that you can break this up into three very simple cases.

The first case is where you win the first round. So let's say you just start playing the game and you win the first round. Now, that's as if the game started from scratch, because that is basically saying that, whatever comes after, I haven't had any losses yet. So I'm basically starting the game from scratch; it's as if nothing happened. That event, where I win the first round, happens with probability p by definition. And what is the expected number of rounds that I play in that case? Well, it's going to be 1, because I did take that first round, which I won, plus e. It's kind of difficult to understand this, I would say it's not super easy, but I'm defining this quantity e in a recursive way, where you see e on the left-hand side and you see it show up on the right-hand side of the equation as well. But again, the first part of the sum is saying there's probability p that I win the first round. In that case it's like the game starts from scratch, so the expected number of rounds in that situation would be one plus e; e because I'm starting the game from scratch.

Now let's look at the two other cases. Another case is if I get loss, loss. So I begin the game, I get a loss, I unfortunately get another loss, and the game is over. This is the easiest of the three cases to understand. The probability of this is one minus p times one minus p, again by definition. And what is the expected number of rounds? Well, it could only be two, because once I've gotten those two losses it's over. So I put two here.

The final case would be if I get a loss first and then I get a win. That happens with probability one minus p times p, so that again is getting a loss followed by getting a win. And what's the expected number of rounds I play in that situation? Well, I've already played two rounds to get the loss and
the win, so that's two, and then to that I add little e. Little e again because I'm starting from scratch: I got a win on the second round, so it's like I'm starting everything from scratch.

Now, if you do all the algebra you can solve for e. It's not too difficult. You'll get the closed-form solution that little e is equal to two minus p, divided by the square of one minus p, that is, (2 - p) / (1 - p)^2. Now I want to be really clear here: this math here, if you didn't follow it, it's okay. The main point I wanted to get across is that it took me a while to explain this to you, it took me a while to work out the algebra, and it took me a long time to even arrive at the fact that this could be the solution to the problem.

Now, what if instead of doing all that, what if instead of thinking about it at all, we just coded it? We took advantage of the high computing power we have nowadays and we just took these rules that we were given in blue and coded up a solution, and then estimated the solution that way. So I want to explain the code to you. Again, this is pseudocode, but it wouldn't be too hard for you to write this yourself.

I'm going to initialize rounds to be empty. This is going to store the number of rounds that I play on various simulations. For i in 1 to 1 million: so I'm again doing 1 million trials, 1 million simulations of this game, and on each simulation I'm going to see how many rounds I got to play. So what do I do inside the for loop? I set r equals zero. r is going to be the number of rounds I played on this simulation, so currently it's zero, nothing. And n_loss is going to be zero. This is the number of losses that I have gotten in a row so far.

Now we have an inner while loop here. While the number of losses is not equal to 2, again that's straight from the definition of the problem, while I have not hit two losses in a row, r plus equals one. This says: update the number of rounds I've played by one. So this just keeps going as long as I don't have two losses in a row. Then I'm going to simulate a random number between zero and one, and then I'm going to check if this
random number is less than p. p, again, is the probability of winning a round. So for example, if p was equal to one half and this random number that I chose in this step is less than one half, then we would say that I won the round, and therefore we reset n_loss to be zero, because when you win a round of this game the number of losses you have in a row is, by definition, now zero. Else, which means that we did not win that round, we update n_loss to be plus equals one. That means that I do have one extra loss in a row. And we just keep going, and eventually n_loss is going to be equal to two because I got two losses in a row.

Then I do rounds dot append r. Again, rounds is a big list that, throughout this whole simulation, is collecting the number of rounds that I played in each game, so we append this little r to rounds. And at the very end of the day, at the very end of this for loop, after I've done my million simulations of this game, I return the average of rounds. The bar you see on top is an average, and that's exactly what I want: what is the expected number of rounds that I play in this game, if these are the rules of the game?

So let's take a look at the results of this Monte Carlo method. If p is equal to one half, which is the probability of winning, then the truth is going to be six, computed straight from this analytical formula we derived. I've shown the results of the Monte Carlo simulation here, so we see we get extremely close to the truth. But do notice that it took me about 10 seconds to get there, so it wasn't super fast. 10 seconds is not a long time; you can sit there, it goes by in a flash. But notice that it took quite a bit longer to do a million runs of this Monte Carlo simulation than of this much easier Monte Carlo simulation, and we'll touch on that point again in a moment.

Okay, so this was our second example, and the big takeaway I want you to get from this is not this example at all. This is not a probability-solving video. This is a video which is trying to show you that some problems can require a
lot of thinking, clever problem solving, kind of bashing your head on the paper for a while when you can't get the answer. But if all you're after is that number, all you're after is the expected number of rounds for some given value of p, then you might as well just code a solution. This didn't take too long to write, and then you'll get pretty close to that answer. And if you want to get even closer, just make the number of simulations bigger. You can get arbitrarily close, as close as you want, if you're willing to wait that long for your process to finish. So Monte Carlo methods can be a very easy way to get the answer to a problem which is otherwise difficult to solve.

And that's a great segue into talking about what MC, MC is Monte Carlo, is and what it is not. Because so far everything I've told you makes it sound awesome, but we like to think about things from every angle on this channel, so let's talk about pros and cons now to close the video.

As we were saying, Monte Carlo is easy. Even if you haven't thought of a clever way to solve the problem, you can just code it yourself and get the answer to some arbitrary degree of precision. And to talk about that more, let's say that it's a problem in a subject area that you're not familiar with, like genetics or stock trading or something. As long as you have the rules of the problem, it doesn't really matter if you're a subject matter expert or not; you can just code this and get the answer anyway. So it helps people who are maybe not necessarily in the same field still arrive at practical results for these problems. That is one of the big pros of Monte Carlo methods.

I put this easy in quotes because sometimes the code you write does get kind of long, kind of out of control, so you have to be careful there. But it can be easier than going through the calculus or linear algebra or whatever you need to solve the problem analytically. And I also put sometimes fast. It took not too long to get my answer, but it's not always going to be fast, and that leads me into these next two points, which
I'll talk more about, and this is what Monte Carlo methods are not.

So, it isn't always fast. I just said it was sometimes fast; now I'm saying it's not always fast. The point I want to get across is that it seemed reasonably fast for our examples here, but it really depends on the exact problem you have. Even if we look at this problem still, if we look at a value of p that's not one half, if you look at something bigger than a half, you're going to see problems start to arise. To get an idea of that, I've graphed this equation. This equation is, again, the analytical solution to our problem number 2 here, and this graph looks like this. What this is telling me is that as the probability goes from zero to one, the expected number of rounds I'm going to play this game goes to infinity. That is not good news for the Monte Carlo simulation if p is indeed large. What that means is that when I'm inside this loop, this is going to run for a long time if p is large, because it's barely ever going to see two losses in a row.

Indeed, if I put up the full table of results here, for p equal to one half, p equal to 0.75, and p equal to 0.9, you see this big issue of how long it's taking. It's still getting close to the truth in all cases, but it is taking longer and longer and longer to get me the answer. In fact, if I push this to the extreme, p equal to 0.99999, this is going to take me hours or even days to finish, depending on how big I've set p. And that can be seen analytically here. It's going up like this. It's not going up linearly or anything nice like that; it's going up in this very nasty, rapidly steepening curve. So that's why Monte Carlo methods are not always fast. They can be fast, depending on the exact parameters you put in, but depending on the complexity of the problem or the exact parameters you put in, they can take a really long time and become impractical to use.

And just a side note on that, I didn't write this, but for different values of p: notice that this whole Monte Carlo loop was only for a single value of p. So after I do this
whole loop and I wait 10 seconds, I get the answer for p equal to one half. But what if I want the answer for p equal to one fourth, or p equal to 0.1? Then I have to run this over and over and over again. So another disadvantage of Monte Carlo methods is that they're not generalizable. Just because you have a result for some value of your parameter doesn't mean you can say anything about the result for a different value of your parameter. On the other hand, if we solve this analytically and get this closed-form formula, we can get all of the answers in a very short amount of time. We can even graph them like this and see what the trend is. That's just something you cannot get with Monte Carlo methods.

And that leads really well into the very last and, in my opinion, the most important disadvantage to keep in mind about Monte Carlo methods, which is that they are not interpretable. True, if you care about only the final number and getting close to the final number, then Monte Carlo methods could be a good tool. But if you care at all about how you got there, or interpreting the result, as we often do in data science and math and stats, then Monte Carlo methods unfortunately are not your friend.

Notice that even in this very simple example, knowing that the answer is pi over 4 tells you about how we got to the answer. Because when you see a pi, you know there was a circle involved, maybe the area of the circle, and you look at this 4 and you would say, oh, that's 2 times 2, that's the area of the square. So looking at this true form tells you the whole story about how we arrived at the solution and the nature of the problem. Looking at just the numerical answer might be enough for some practical purpose, but it doesn't help you understand the nature of the problem, which could be a big issue down the road if you're trying to generalize this idea. And similar notes here. I think we've kind of beaten this example to death, but note that having the exact number in this case doesn't tell you anything about the formula or this process of arriving at that final
number. So Monte Carlo methods unfortunately are not interpretable. That's the price you pay for getting a quick solution to the problem which is approximately correct: you have absolutely zero idea how you got there.

So these are what Monte Carlo methods are and what Monte Carlo methods are not. I want you to keep all that in mind when you decide whether you're going to use them or tackle the problem in a more analytic way, from first principles. Both have their pros and cons, and the choice depends on your situation.

Okay, so that's all I had to say about Monte Carlo methods. If you want to see a coding video on Monte Carlo methods, please let me know. Otherwise please like and subscribe for more videos just like this, and see you next time.