Best practices for Jobs
1. Select the appropriate hardware for your job.
CPUs are suitable for most workloads. For large data frames or models, choose more RAM and larger storage. For compute-heavy workloads that involve matrix operations, GPUs can accelerate processing, but only if your software is written to make use of them. To set these values for your job on nbox, modify the get_resource function in the autogenerated nbx_user.py file.
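As a rough illustration of the sizing logic above, the sketch below picks a resource profile from the expected data size and from whether the code can use a GPU. Everything in it is hypothetical: ResourceSpec, pick_resources, and the multipliers are stand-ins, and the real get_resource in nbx_user.py may expect a different return type and different field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical stand-in for whatever get_resource returns in nbx_user.py."""
    cpu_cores: int
    memory_gib: int
    disk_gib: int
    gpu_count: int = 0


def pick_resources(data_gib: float, gpu_ready: bool = False) -> ResourceSpec:
    """Choose a profile from the data size and whether the code can use a GPU.

    The multipliers are illustrative assumptions, not NimbleBox recommendations.
    """
    # Leave headroom: in-memory copies of a data frame often exceed its on-disk size.
    memory = max(2, int(data_gib * 3))
    disk = max(10, int(data_gib * 5))
    # Only request a GPU when the software is actually written to use one.
    gpus = 1 if gpu_ready else 0
    return ResourceSpec(cpu_cores=2, memory_gib=memory, disk_gib=disk, gpu_count=gpus)
```

The values returned here would then be copied into get_resource in whatever form that function expects.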
2. Store the results of your job using the nbox.Instance.mv method.
Once processing is complete, transfer your files back to persistent storage, such as your NimbleBox instance or an Amazon S3 bucket. To share files or folders, use the nbox.Instance.mv method or the items in nbox.lib.
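Moving one archive is usually simpler than moving many small files. The sketch below bundles a job's output folder into a single archive using only the standard library; bundle_results is a hypothetical helper name, and the transfer step itself (nbox.Instance.mv or an S3 upload) is deliberately left out because its exact signature is not covered here.

```python
import tarfile
from pathlib import Path


def bundle_results(output_dir: str, archive_path: str) -> Path:
    """Pack everything under output_dir into one gzip-compressed tar archive.

    archive_path should live outside output_dir so the archive does not
    try to include itself.
    """
    src = Path(output_dir)
    dst = Path(archive_path)
    dst.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(dst, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the output folder.
        tar.add(src, arcname=src.name)
    return dst
```

The returned path is what you would hand to your transfer call once the job finishes.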
3. Keep track of the changes you make to your Jobs.
Only run code that has been committed, so that every run maps to a commit ID you can revert to if something goes wrong. Tools like GitHub Actions are helpful for this. All you need is the nbox package, which can be initialized from its CLI (command-line interface) and has API/CLI parity for triggering processes.
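One way to enforce this is a small pre-flight check that refuses to continue when the working tree has uncommitted changes and otherwise returns the commit hash to record alongside the run. This is a sketch that assumes git is installed and the job code lives in a git repository; is_clean and committed_revision are hypothetical helper names.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """True when `git status --porcelain` printed nothing (no pending changes)."""
    return porcelain_output.strip() == ""


def committed_revision(repo_dir: str = ".") -> str:
    """Return the current commit hash, or raise if there are uncommitted changes.

    Untracked files also appear in the porcelain output, so they block the run too.
    """
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    if not is_clean(status):
        raise RuntimeError("Uncommitted changes found; commit before triggering the job.")
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout.strip()
```

Calling committed_revision() in a CI step before the job is triggered gives you an ID to log and, if needed, to revert to.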
4. Share your jobs with other members of the organization.
Deployments and Jobs are simply nbox.Operator objects combined in a specific order, so you can share them anywhere. We recommend that organizations set up a private (or public) repository to store all their operators, and then import those operators as submodules at the target locations.
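To make the idea of "operators combined in a specific order" concrete, here is a toy sketch. The Operator class below is a local stand-in written for illustration only, not the real nbox.Operator, whose interface may differ; Clean, Count, and Pipeline are likewise hypothetical.

```python
class Operator:
    """Stand-in base class for illustration; the real nbox.Operator may differ."""

    def __call__(self, value):
        return self.forward(value)

    def forward(self, value):
        raise NotImplementedError


class Clean(Operator):
    """Normalizes a list of strings."""

    def forward(self, value):
        return [v.strip().lower() for v in value]


class Count(Operator):
    """Counts distinct items."""

    def forward(self, value):
        return len(set(value))


class Pipeline(Operator):
    """Runs child operators in the order they were given."""

    def __init__(self, *steps):
        self.steps = steps

    def forward(self, value):
        for step in self.steps:
            value = step(value)
        return value
```

Because each step is a self-contained class, a shared repository can hold Clean and Count, and each project can assemble its own Pipeline from them.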
6. Schedule jobs based on your needs.
NimbleBox keeps job schedules in the UTC time zone so that everything stays consistent. When you view them, however, the NimbleBox app automatically displays times in the time zone detected by your browser.
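When working with a schedule expressed in UTC, it helps to check which local time it corresponds to. The sketch below does this with the standard library only. It assumes a fixed UTC offset, which ignores daylight saving time; utc_to_offset is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def utc_to_offset(utc_time: datetime, offset_hours: float) -> datetime:
    """Convert a timezone-aware UTC datetime to a fixed UTC offset given in hours."""
    if utc_time.tzinfo is None:
        raise ValueError("utc_time must be timezone-aware")
    return utc_time.astimezone(timezone(timedelta(hours=offset_hours)))


# A job scheduled for 22:00 UTC, viewed from UTC+05:30.
scheduled = datetime(2024, 1, 15, 22, 0, tzinfo=timezone.utc)
local = utc_to_offset(scheduled, 5.5)
print(local.isoformat())  # 2024-01-16T03:30:00+05:30
```

Note that the local date rolls over to the next day, which is easy to miss when reading a UTC schedule by eye.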
7. Connect with existing schedulers.
Data-science and machine-learning workloads generally make up less than 10% of overall workloads in most companies. If you want everything managed in a single location, you can wrap the job code with your existing scheduling system, such as Airflow.
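A minimal sketch of such a wrapper is shown below, assuming Airflow 2.4 or later (where DAG accepts a schedule argument). The run_job function, the nbx_job_wrapper DAG ID, and JOB_COMMAND are all hypothetical; JOB_COMMAND is a placeholder that only prints a message and should be replaced with whatever command or API call you actually use to trigger the job.

```python
import subprocess
import sys
from datetime import datetime

# Placeholder: replace with the command you actually use to trigger your job.
JOB_COMMAND = [sys.executable, "-c", "print('job triggered')"]


def run_job(command):
    """Run the trigger command and return its exit code; raises if it fails."""
    return subprocess.run(command, check=True).returncode


try:
    # Airflow is treated as optional so this file still imports without it.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    DAG = None

if DAG is not None:
    with DAG(
        dag_id="nbx_job_wrapper",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id="trigger_job",
            python_callable=run_job,
            op_kwargs={"command": JOB_COMMAND},
        )
```

Keeping run_job free of Airflow imports means the same function can be reused from another scheduler or called directly while testing.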