.. _tut_job_env:

Modify the execution environment
================================

All executable units in the JIP system are attached to a specific
:py:class:`~jip.profiles.Profile` that can be modified from within the tool
implementation as well as at submission or execution time.

The Profile
-----------
The :py:mod:`jip.profiles` module documentation covers the properties that you
can set and modify. Here, we will go over some of them to explain how, and in
which order, a profile is applied to a job.

The Job name
************
The job profile of a tool can be accessed from within the tool's *setup* or
*validate* block. This is important because the properties of the profile are
applied at submission time. That means they have to be set before the tool is
*submitted* to your compute cluster.

You can set the job name using either the ``name()`` function or the ``job``
attribute. For example::

    >>> from jip import *
    >>> @tool()
    ... class MyTool(object):
    ...     def setup(self):
    ...         self.name("Custom-Name")
    ...         # this is equivalent
    ...         self.job.name = "Custom-Name"
    ...     def get_command(self):
    ...         return "true"
    >>> p = Pipeline()
    >>> node = p.run('MyTool')
    >>> jobs = create_jobs(p)
    >>> assert jobs[0].name == 'Custom-Name'

Your tool, and therefore the job created from it, will carry the given name,
in this case "Custom-Name".

The same technique works for pipelines, but there it alters the pipeline
name, not the names of the jobs embedded in the pipeline::

    >>> from jip import *
    >>> @pipeline()
    ... class MyPipeline(object):
    ...     def setup(self):
    ...         self.name("The-Pipeline")
    ...
    ...     def pipeline(self):
    ...         p = Pipeline()
    ...         p.run("MyTool")
    ...         return p
    ...
    >>> p = Pipeline()
    >>> node = p.run('MyPipeline')
    >>> jobs = create_jobs(p)
    >>> assert jobs[0].pipeline == 'The-Pipeline'
    >>> assert jobs[0].name == 'Custom-Name'

If you run "MyPipeline", the pipeline name is set to "The-Pipeline", but the
job's name is still "Custom-Name". You can, however, change the job names when
you construct the pipeline. For this, create a custom profile that is then
applied to the job. For example::

    >>> @pipeline()
    ... class MyPipeline(object):
    ...     def setup(self):
    ...         self.name("The-Pipeline")
    ...
    ...     def pipeline(self):
    ...         p = Pipeline()
    ...         p.job("The-Tool").run("MyTool")
    ...         return p
    ...
    >>> p = Pipeline()
    >>> node = p.run('MyPipeline')
    >>> jobs = create_jobs(p)
    >>> assert jobs[0].pipeline == 'The-Pipeline'
    >>> assert jobs[0].name == 'The-Tool'

Here, we used ``p.job("The-Tool")`` to create a custom environment with a
specified name, and then ``run`` a tool *in* that environment.

Please keep in mind that *multiplexing* alters job names. If a pipeline
contains two jobs with the same name, their names are suffixed with a counter
starting at '0', applied in the order in which the nodes were added to the
pipeline. For example::

    >>> @tool()
    ... class MyTool(object):
    ...     """My tool
    ...     usage:
    ...         mytool <input>
    ...     """
    ...     def setup(self):
    ...         self.name("MyName")
    ...
    ...     def get_command(self):
    ...         return "true"
    ...
    >>> @pipeline()
    ... class MyPipeline(object):
    ...     def setup(self):
    ...         self.name("The-Pipeline")
    ...
    ...     def pipeline(self):
    ...         p = Pipeline()
    ...         p.run("MyTool", input=['A', 'B'])
    ...         return p
    >>> p = Pipeline()
    >>> node = p.run('MyPipeline')
    >>> jobs = create_jobs(p, validate=False)
    >>> assert jobs[0].pipeline == 'The-Pipeline'
    >>> assert jobs[0].name == 'MyName.0'
    >>> assert jobs[1].pipeline == 'The-Pipeline'
    >>> assert jobs[1].name == 'MyName.1'

In this example, we call ``MyTool`` with two input values, ``A`` and ``B``.
That causes the multiplexing to kick in and results in two jobs being
created: ``MyName.0`` and ``MyName.1``.

With multiplexing in place, it can be useful not to apply fixed names or
other environment properties to your nodes, but to use templates to
customize, for example, your job names according to the tool's options. Take
the example from above. We can use the job's ``input`` option to construct
more meaningful job names:

.. code-block:: python
    :emphasize-lines: 8

    >>> @tool()
    ... class MyTool(object):
    ...     """My tool
    ...     usage:
    ...         mytool <input>
    ...     """
    ...     def setup(self):
    ...         self.name("${input|name}")
    ...
    ...     def get_command(self):
    ...         return "true"

    >>> @pipeline()
    ... class MyPipeline(object):
    ...     def setup(self):
    ...         self.name("The-Pipeline")
    ...
    ...     def pipeline(self):
    ...         p = Pipeline()
    ...         p.run("MyTool", input=['A', 'B'])
    ...         return p

    >>> p = Pipeline()
    >>> node = p.run('MyPipeline')
    >>> jobs = create_jobs(p, validate=False)
    >>> assert jobs[0].pipeline == 'The-Pipeline'
    >>> assert jobs[0].name == 'A'
    >>> assert jobs[1].pipeline == 'The-Pipeline'
    >>> assert jobs[1].name == 'B'

In this example, the created jobs will have the names 'A' and 'B'.

.. note:: In order to assign the node names, we make use of the ``name``
          template filter. We do so because JIP operates on **absolute**
          paths of input and output files, and here we only want to use the
          base name of the input file.

Custom profiles in pipelines
****************************
We have seen before that we can use the ``job()`` function to create a custom
profile and then run a job with that profile. In fact, you can use this to
create a set of profiles in your pipeline and then run different jobs with
different profiles. For example, assume that a few of your tools need more
CPUs and, in your environment, have to be submitted to a specific *queue*,
while other jobs should just run with a *default* profile. You could do
something like this in a JIP script (you can call the same functions on a
pipeline directly):

.. code-block:: python

    slow = job(threads=8, queue="slow_queue", time="8h")
    fast = job(threads=1, queue="fast_queue", time="1h")

    europe = slow.run('predict_weather', location='Europe')
    america = slow.run('predict_weather', location='America')
    stats = fast.run('weather_stats', predictions=[europe, america])

In this example, we create two job profiles, one for slow, multi-threaded
jobs and one for fast jobs. We can then run tools, here *predict_weather* and
*weather_stats*, using these dedicated profiles.

The profiles themselves are again callable, which means you can customize
them further. Say you want to assign names to the jobs and give one of them a
higher priority:

.. code-block:: python

    europe = slow("EU", priority=20).run('predict_weather', location='Europe')
    america = slow("USA", priority=10).run('predict_weather', location='America')
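As noted above, the same profile functions can also be called on a pipeline
object directly. The following is only a minimal sketch of that variant: it
assumes that ``Pipeline.job()`` accepts the same keyword arguments as the
script-level ``job()`` function shown above, and it reuses the hypothetical
*predict_weather* and *weather_stats* tools from the weather example:

.. code-block:: python

    from jip import *

    @pipeline()
    class WeatherPipeline(object):
        def pipeline(self):
            p = Pipeline()
            # sketch: assumes p.job() takes the same keywords as job()
            slow = p.job(threads=8, queue="slow_queue", time="8h")
            fast = p.job(threads=1, queue="fast_queue", time="1h")
            # run the heavy predictions with the slow profile
            europe = slow.run('predict_weather', location='Europe')
            america = slow.run('predict_weather', location='America')
            # collect both predictions with the fast profile
            fast.run('weather_stats', predictions=[europe, america])
            return p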