IronWorker - Using Temporary Local Disk Storage

Workers have access to a large amount of temporary local storage that’s dedicated to each worker. You can perform almost any file operation on it that you could in a local environment.

You access this storage by making use of the variable user_dir in the worker. This variable contains the path of the directory your worker has write access to.

Saving Files to Disk

Here’s an example that downloads a file from the web and saves it in local storage. The log snippet just logs the contents of user_dir.

local_file.rb
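require 'open-uri'

class S3Worker < IronWorker::Base

  def run
    # user_dir is the worker's writable scratch directory
    filepath = File.join(user_dir, "ironman.jpg")
    File.open(filepath, 'wb') do |fo|
      # On Ruby 2.5 and later, call URI.open instead of open
      fo.write open("http://www.iron.io/assets/banner-mq-ironio-robot.png").read
    end

    # List the directory contents and write them to the worker log
    user_files = %x[ls #{user_dir.inspect}]
    log "\nLocal Temporary Storage ('user_dir')"
    log user_files
  end

end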

Location of Uploaded Files and Folders

The user_dir directory also contains any uploaded files that you’ve included with your code. Note that any folders or nested files will appear at the top level.

For example, let’s say you merge a file that lives at a nested relative path:

merge "../site_stats/client.rb"

This file will be placed in the user_dir directory. You can make use of it there, create local/remote path references (using the local/remote query switch in your worker), or replicate the path and move the file there. (We recommend one of the first two options.)

user_dir/
  ...
  client.rb
  ...
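
The flattening described above can be sketched in plain Ruby. This is only an illustration: flattened_path is a hypothetical helper, not part of IronWorker, and "/task/" is a made-up stand-in for whatever user_dir contains at runtime.

```ruby
# Hypothetical helper (not an IronWorker API): shows where a merged file
# would land if only its base name is kept, as described above.
def flattened_path(user_dir, merged_path)
  File.join(user_dir, File.basename(merged_path))
end

# "/task/" is an assumed value for user_dir, used only for illustration.
puts flattened_path("/task/", "../site_stats/client.rb")
# prints /task/client.rb
```

The directory portion of the original path ("../site_stats/") is dropped, which is why a reference that relies on the nested layout will not resolve inside the worker.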

In Ruby, to make use of the file (in the case of a code file), you would use a require_relative statement that points at the top level rather than at the original nested path.

 require_relative './client'

Use Cases

Typical use cases might include:

  • Downloading a large product catalog or log file, parsing it, processing the data, and inserting the new data into a database.
  • Downloading an image from S3, modifying it, and re-uploading it.
  • Splitting up a large video file or the results of a website crawl in S3, then creating and queuing multiple workers to process each video or page slice.
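
As a rough sketch of the first use case, the snippet below parses a log file that sits in scratch storage. Several things here are assumptions made for illustration: Dir.mktmpdir stands in for user_dir so the code runs outside a worker, the file is written inline instead of being downloaded, the two-column log format is invented, and count_statuses is a hypothetical helper.

```ruby
require 'tmpdir'

# Hypothetical helper: counts how often each status appears in a log file.
# File.foreach reads one line at a time, so a large file never has to fit
# in memory all at once.
def count_statuses(log_path)
  counts = Hash.new(0)
  File.foreach(log_path) do |line|
    status = line.split[1] # assumed line format: "<path> <status>"
    counts[status] += 1 if status
  end
  counts
end

# Dir.mktmpdir is a stand-in for user_dir so this sketch runs anywhere.
Dir.mktmpdir do |scratch_dir|
  log_path = File.join(scratch_dir, "access.log")
  # A real worker would download this file; it is written inline here.
  File.write(log_path, "/index 200\n/missing 404\n/about 200\n")
  p count_statuses(log_path)
end
```

In a worker, the resulting counts would then be written to a database or object store, since the scratch directory does not outlive the task.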

Best Practices

This storage is temporary and is available only while the worker is running. To persist any data the worker produces, use a database or an object store.

We recommend that you not pass large data objects or data files into workers, but instead use object storage solutions like AWS S3 or databases. To do this, upload your data to S3 or store it in the database from your app, then pass the identifier of the object to the worker. The worker can then access the data from the data store. This is more efficient in terms of worker management and better for exception handling.
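
The pass-an-identifier pattern might look like the following in plain Ruby. This is a sketch under stated assumptions, not IronWorker's API: a Hash stands in for S3 or a database, the "object_key" field name and the sample key are invented, and build_payload and process_payload are hypothetical helpers.

```ruby
require 'json'

# App side (hypothetical helper): the payload carries only the object's key,
# never the data itself.
def build_payload(object_key)
  JSON.generate("object_key" => object_key)
end

# Worker side (hypothetical helper): resolve the key, then work on the data.
def process_payload(payload_json, store)
  key = JSON.parse(payload_json).fetch("object_key")
  data = store.fetch(key) # raises KeyError if the object is missing
  data.lines.drop(1).size # number of data rows, skipping the header line
end

# A Hash stands in for S3 or a database; a real worker would use that
# store's own client library.
object_store = { "catalogs/sample.csv" => "sku,price\nA1,9.99\nB2,4.50\n" }

payload = build_payload("catalogs/sample.csv")
puts process_payload(payload, object_store)
# prints 2
```

Because the payload stays small, a missing or malformed object shows up as a single, specific error in the worker rather than as a failed or truncated upload when the task is queued.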

Examples

You can find more examples of making use of local disk storage here: