{"id":275,"date":"2021-01-08T21:15:04","date_gmt":"2021-01-08T20:15:04","guid":{"rendered":"https:\/\/logbooks.ifosim.org\/pykat\/?p=275"},"modified":"2021-01-08T21:15:04","modified_gmt":"2021-01-08T20:15:04","slug":"parallelising-finesse-2-pykat-three-ways","status":"publish","type":"post","link":"https:\/\/logbooks.ifosim.org\/pykat\/blog\/parallelising-finesse-2-pykat-three-ways\/","title":{"rendered":"Parallelising Finesse 2 + PyKat: three ways"},"content":{"rendered":"\n<p>Occasionally you might encounter a situation where you need to run the same simulation lots of times with slightly different configurations. In cases where this also involves computationally expensive things (many orders of higher order modes, very high resolution axes, <code>lock<\/code> commands, maps,&#8230;), that can get quite tedious and you might wonder if you can parallelise your code to speed things up a bit. <\/p>\n\n\n\n<p>With the caveat that you should first do a sanity check as to whether your code is already set up to run as fast as it could per-case*, the short answer is yes.<\/p>\n\n\n\n<p>(*e.g. could your code be tuned closer to an operating point so the <code>lock<\/code>s don&#8217;t have to work as hard, do you actually <em>need<\/em> to run with <code>maxtem 20<\/code> or do your results converge by <code>maxtem 4<\/code>, etc.)<\/p>\n\n\n\n<p>Here are 3 methods (out of probably several more) that can be used to run Finesse 2 simulations in parallel using Python:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><a href=\"#parakat\" data-type=\"internal\" data-id=\"#parakat\">Built-in method: <code>parakat<\/code><\/a><\/li><li><a href=\"#ipyparallel\" data-type=\"internal\" data-id=\"#ipyparallel\"><code>ipyparallel<\/code> (the package parakat is built on)<\/a><\/li><li><code><a href=\"#multiprocessing\">multiprocessing<\/a><\/code><\/li><\/ol>\n\n\n\n<p>Below I&#8217;ll assume you are working in a Jupyter Notebook; the same methods can be used directly in a python script. 
Downloadable examples of everything covered here are linked <a href=\"#downloads\">at the end<\/a>, along with some thoughts on <a href=\"#which-method\">which method might suit<\/a> your needs best.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Demo code used by all examples<\/h2>\n\n\n\n<p>Let&#8217;s use the same basic example in all cases. For parakat we need the kat code on its own; for the other two we&#8217;ll need to pass it into a function. <\/p>\n\n\n\n<p>In this example we parse the Finesse code for a Gaussian beam propagating to a mirror, and measure the power transmitted through the mirror as the laser&#8217;s power is varied linearly from 1W to 100W. Each parallel job then repeats this process for a different value of mirror transmission. <\/p>\n\n\n\n<p>This isn&#8217;t something you would normally need to parallelise, but it works for the purposes of testing the different methods quickly.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>n_eng = 8 # max number of parallel tasks\nxsteps = 10000  # xaxis resolution in each job\nn_jobs = 100    # sets the number of different jobs we need to iterate through\n\nkatcode = f\"\"\"\nl L0 1 0 n0\ngauss mybeam L0 n0 100u 0 101u 1m\nmaxtem 4\ns s0 0 n0 n1\nm m 0.5 0.5 0 n1 nout\npd P nout\nxaxis L0 P lin 1 100 {xsteps}\"\"\"\n\nvals = range(1,n_jobs) # 1..n_jobs-1, i.e. n_jobs-1 values, so T stays strictly between 0 and 1\ntestvals = &#091;v\/n_jobs for v in vals]<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"parakat\">ParaKat<\/h2>\n\n\n\n<p>For most simple cases, ParaKat is the recommended method. ParaKat only does the explicit <em>Finesse<\/em> calculation in parallel, returning a list of <code>out<\/code> objects, so its functionality is a little limited. However, this also usually makes it the easiest to work with, and it includes some nice UI features like a progress bar \ud83d\ude42<\/p>\n\n\n\n<p>ParaKat is based on <code>ipyparallel<\/code>. 
Like the more explicit usage we&#8217;ll see below, this relies on first externally starting up a cluster of engines using <code>ipcluster<\/code>, which can then be assigned jobs by your code. <code>ipcluster<\/code> (installable e.g. via <code>conda<\/code>) is intended to be used through the terminal. If using Jupyter notebooks \/ JupyterLab, you can always just launch a terminal window there and run the command:<\/p>\n\n\n\n<p> <code>ipcluster start -n   <\/code>[number of engines you want]<code> --daemonize<\/code><\/p>\n\n\n\n<p>However, we can also do this directly in-notebook, either using the <code>!<\/code> prefix, or slightly more pythonically (and controllably) using <code>subprocess<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import subprocess\nimport time\nsubprocess.run(f\"ipcluster start -n {n_eng} --daemonize\",shell=True) #daemonize and Popen both make this happen in the background\ntime.sleep(10) #wait a few seconds for the cluster to start up<\/code><\/pre>\n\n\n\n<p>Now that the cluster is up and running, we set up the code as usual using Pykat:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from pykat import finesse\n\nbase=finesse.kat()\nbase.verbose=False\nbase.parse(katcode)<\/code><\/pre>\n\n\n\n<p>and now we work slightly differently to a serial Pykat run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from pykat.parallel import parakat\n\npk = parakat()\nfor T in testvals:\n    kat = base.deepcopy()\n    kat.m.setRTL(1-T,T,0)\n    pk.run(kat)\nouts = pk.getResults() <\/code><\/pre>\n\n\n\n<p>Note that we still have a <code>for<\/code> loop here, but now it is just used to create and collate the various <code>kat<\/code> objects. 
No calculations occur until the command <code>pk.getResults()<\/code>.<\/p>\n\n\n\n<p>If <code>subprocess<\/code> was used to launch the engines, we can now also conveniently use it to stop them:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>subprocess.run(\"ipcluster stop\",shell=True)\ntime.sleep(10) #wait a few seconds for the cluster to shut down<\/code><\/pre>\n\n\n\n<p>Plotting the results (or otherwise working with the outputs of the runs) is then just a case of accessing the relevant list item in <code>outs<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import matplotlib.pyplot as plt\n\nfor o in outs:\n    plt.plot(o.x,o&#091;'P'])<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"ipyparallel\">ipyparallel<\/h2>\n\n\n\n<p>ParaKat is built on <code>ipyparallel<\/code>, so the usage is quite similar between the two. Using <code>ipyparallel<\/code> directly gives you more flexibility, since we are directly passing in the Python\/pykat code of choice. It could therefore be extended to include further post-processing, multiple kat runs, etc.<\/p>\n\n\n\n<p>As above, we need to use <code>ipcluster<\/code> to externally launch the engines we want:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import subprocess\nimport time\nsubprocess.run(f\"ipcluster start -n {n_eng} --daemonize\",shell=True) #daemonize and Popen both make this happen in the background\ntime.sleep(10) #wait a few seconds for the cluster to start up<\/code><\/pre>\n\n\n\n<p>This time, we need to create a <em>function<\/em> which describes what we want to happen in each job. 
Unfortunately, this function can&#8217;t rely on anything defined outside it (it runs on the engines, which don&#8217;t share your notebook&#8217;s namespace), so we have to import pykat inside it, and either explicitly write out the katcode or pass it in as another argument (shown here):<\/p>\n\n\n\n<p><em>NB: If you are working in a notebook, it typically works best to define functions in a separate cell.<\/em><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>def myfunc(T,code):\n    from pykat import finesse #imported here so it is available on the engine\n    k=finesse.kat()\n    k.verbose=False\n    k.parse(code)\n    k.m.setRTL(1-T,T,0)\n    o=k.run()\n    return o<\/code><\/pre>\n\n\n\n<p>To parallelise and run the code, we use <code>Client<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from ipyparallel import Client\n\nrc=Client() #class object to start the client to the parallel cluster\nlview = rc.load_balanced_view()#creates a LoadBalancedView object with load-balanced execution using all engines\nlview.block = False # if self.block is False, returns AsyncResult, else: returns actual result of f(*args, **kwargs) on the engine(s)\nresults = &#091;lview.apply_async(myfunc,yy,katcode) for yy in testvals] #easy enough to add the second 'code' arg to apply_async here\nouts = &#091;d.get() for d in results]\nrc.close()#good practice, if unessential on some local machines<\/code><\/pre>\n\n\n\n<p><em>NB: <code>rc.load_balanced_view()<\/code> creates a LoadBalancedView object with load-balanced execution using all engines; if you don&#8217;t want that for e.g. 
memory usage reasons, use <code>rc.direct_view()<\/code> instead to skip the load balancing.<\/em><\/p>\n\n\n\n<p>As before, we now stop those engines manually and then extract the results to use as we please:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>subprocess.run(\"ipcluster stop\",shell=True)\ntime.sleep(10) #wait a few seconds for the cluster to shut down\n\nfor o in outs:\n    plt.plot(o.x,o&#091;'P'])<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"multiprocessing\">multiprocessing<\/h2>\n\n\n\n<p>This is simpler and cleaner than the above, since engine start\/stop is handled internally (i.e. we don&#8217;t need <code>ipcluster<\/code> this time). However, it might be a little less flexible in what can be iterated over, and engines are restricted to running on the local machine, while <code>ipyparallel<\/code> enables you to send jobs to remote machines and more.<\/p>\n\n\n\n<p>Like <code>ipyparallel<\/code>, we need to iterate over a <em>function<\/em>; unlike ipyparallel this <em>doesn&#8217;t<\/em> seem to require pykat to be imported inside the function every time, so this time we define:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>def myfunc2(T,code):\n    #relies on 'from pykat import finesse' having been run already at the top level\n    k=finesse.kat()\n    k.verbose=False\n    k.parse(code)\n    k.m.setRTL(1-T,T,0)\n    o=k.run()\n    return o<\/code><\/pre>\n\n\n\n<p>Then everything is handled internally; we just need <code>Pool<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from multiprocessing import Pool\n\npool = Pool(processes=n_eng)\nresults = &#091;pool.apply_async(myfunc2, args=(x,katcode)) for x in testvals]\nouts = &#091;p.get() for p in results]\npool.close()#good practice, necessary on some machines<\/code><\/pre>\n\n\n\n<p><em>NB: you can also use <code>outs = [pool.apply(myfunc2, args=(x,katcode)) for x in testvals]<\/code><br>in place of<br><code>results = [pool.apply_async(myfunc2, args=(x,katcode)) for x in testvals]<br>outs = [p.get() for p in results]<\/code><br>
However, <code>pool.apply<\/code> blocks until each job has finished, so written this way the jobs run strictly one after another rather than in parallel. So it&#8217;s one less line of code but considerably slower, for use only in cases where strictly sequential execution is important.<\/em><\/p>\n\n\n\n<p>As coded above, plotting the results is identical to the previous methods:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>for o in outs:\n    plt.plot(o.x,o&#091;'P'])<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"which-method\">Which method should you use?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Functionality\/Usability<\/h3>\n\n\n\n<p><code>parakat<\/code> is the method built into PyKat. For most simple cases where nothing much changes except the contents of the <code>kat<\/code> object, this will do what you need without having to learn too much about what&#8217;s happening behind the scenes. You do have to externally launch engines using ipcluster, but there are many ways to do this (including widgets that let you do this via a GUI, if you prefer).<\/p>\n\n\n\n<p><code>multiprocessing<\/code> seems best in terms of ease of use for cases where you need more to happen in each run than just returning outputs from different <code>kat<\/code> objects. There are no external clusters to manually start or stop, and you can put whatever Python code you need inside the function.<\/p>\n\n\n\n<p>When you want lots of control, and\/or the ability to send your code to engines on remote machines, then <code>ipyparallel<\/code> is your best bet. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Speed<\/h3>\n\n\n\n<p><a href=\"#downloads\" data-type=\"internal\" data-id=\"#downloads\">Below<\/a> I&#8217;ve linked a script that runs the above examples for the same cases using all 3 methods. <\/p>\n\n\n\n<p>Running this multiple times, I found that <code>parakat<\/code> tends to be the slowest, while <code>ipyparallel<\/code> and <code>multiprocessing<\/code> are fairly evenly matched (winner seems to depend on your system). 
I reckon the parakat slowdown is due to time taken to launch progress bars etc.<\/p>\n\n\n\n<p>I suspect if you need to run many parallel runs one after another, ipyparallel-based work will be faster overall, since you can keep the cluster running between jobs. Multiprocessing must be spawning and closing the cluster for every run, which could be less efficient longer term.<\/p>\n\n\n\n<p>Results may vary depending on what you are simulating. In all cases I suggest doing a short &#8216;dummy&#8217; run with less than 10 jobs to check your code does what you want, and get an idea of whether you should go and make a cup of tea (or go to bed) while you wait.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"downloads\">Downloads<\/h2>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_Methods.ipynb\">Jupyter notebook of this post: Parallel_Finesse_Methods.ipynb<\/a><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_Methods.ipynb\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_Methods.py\">Python-only version: Parallel_Finesse_Methods.py<\/a><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_Methods.py\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_SpeedTest.py\">Speed comparison: Parallel_Finesse_SpeedTest.py<\/a><a href=\"https:\/\/logbooks.ifosim.org\/pykat\/wp-content\/uploads\/sites\/4\/2021\/01\/Parallel_Finesse_SpeedTest.py\" class=\"wp-block-file__button\" 
download>Download<\/a><\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Occasionally you might encounter a situation where you need to run the same simulation lots of times with slightly different configurations. In cases where this also involves computationally expensive things (many orders of higher order modes, very high resolution axes, lock commands, maps,&#8230;), that can get quite tedious and you might wonder if you can [&hellip;]<\/p>\n","protected":false},"author":63,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ssl_alp_hide_revisions":false,"footnotes":"","ssl_alp_hide_crossreferences_to":false},"categories":[14],"tags":[88,89],"ssl-alp-coauthor":[70],"class_list":["post-275","post","type-post","status-publish","format-standard","hentry","category-finesse-2","tag-parakat","tag-parallel"],"_links":{"self":[{"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/posts\/275","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/users\/63"}],"replies":[{"embeddable":true,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/comments?post=275"}],"version-history":[{"count":29,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/posts\/275\/revisions"}],"predecessor-version":[{"id":316,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/posts\/275\/revisions\/316"}],"wp:attachment":[{"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/media?parent=275"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/categories?post=275"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/logbooks.ifosim.org\/pykat
\/wp-json\/wp\/v2\/tags?post=275"},{"taxonomy":"ssl-alp-coauthor","embeddable":true,"href":"https:\/\/logbooks.ifosim.org\/pykat\/wp-json\/wp\/v2\/ssl-alp-coauthor?post=275"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}