Today Icelab Learned
 about ruby

Rake tasks with splat arguments

Splat arguments can be helpful, and it turns out we don’t need to forsake them when we’re building Rake tasks, either.

Let’s say we want to build an elasticsearch:reindex task that allows an optional list of entity types to reindex:

namespace :elasticsearch do
  task :reindex do |_t, args|
    Search::Container["search.operations.enqueue_reindex"].(*args.extras)
  end
end

Rake gives us a TaskArguments#extras method, which returns the values not associated with any of the task’s named arguments.

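So now we can call our rake task like so (no spaces between the arguments, and quoted so the shell passes the brackets through as a single argument):

rake "elasticsearch:reindex[products,posts]"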

And args.extras gives us this:

["products", "posts"]
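To see how extras sits alongside named arguments, here’s a small sketch that builds a Rake::TaskArguments by hand, roughly the way Rake does after parsing the bracketed list. The :index argument name and all of the values are made up for illustration; they aren’t part of the task above.

```ruby
require "rake"

# Pretend the task was declared as `task :reindex, [:index]` and invoked as
# `rake "elasticsearch:reindex[main,products,posts]"`.
# The :index name and the values below are hypothetical.
args = Rake::TaskArguments.new([:index], ["main", "products", "posts"])

named  = args[:index]  # the first value is bound to the named argument
extras = args.extras   # everything after the named arguments

puts named.inspect     # "main"
puts extras.inspect    # ["products", "posts"]
```

With no named arguments declared, as in the reindex task above, every value ends up in extras.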

Kwarg splat arguments create new hashes

I made a method with a “double splat” kwarg argument to accept some options and returned a modified hash:

def singularize_options(**options)
  options = options.dup

  # mutate `options` here...

  options
end

In my usual “don’t mutate things we don’t own” style, I #duped the incoming options before going to work on it.

However, thanks to our dry-rb/rom-rb friend @flash-gordon, I learnt this wasn’t necessary! As he says:

when you capture values with **, Ruby creates a new hash instance so calling .dup is not needed

def foo(**options)
  options.object_id
end

{}.tap { |h| puts foo(**h) == h.object_id }
# prints false

Now my method can be even simpler:

def singularize_options(**options)
  # mutate `options` here...
  options
end
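Here’s a quick sketch of what that means for the caller. The method body is a hypothetical example of “mutating options” (renaming :items to :item), not the real implementation. One thing to watch: the new hash is a shallow copy, so nested objects are still shared.

```ruby
# Hypothetical body for the method above: rename :items to :item in place.
def singularize_options(**options)
  options[:item] = options.delete(:items) if options.key?(:items)
  options
end

original = { items: ["posts"], limit: 10 }
result   = singularize_options(**original)

puts original.key?(:items)     # true: the caller's hash keeps its keys
puts result.equal?(original)   # false: ** built a separate hash

# The copy is shallow, so nested objects are still shared with the caller.
result[:item] << "products"
puts original[:items].inspect  # ["posts", "products"]
```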

Specify enumerator index offset

When looping through a collection of items, we sometimes want access to the index of an item at each iteration.

We can do that with collection.each_with_index{ |item, index| ... }.

In this case, index will always start at 0.

If we want index to start from a particular number, we can use Enumerator#with_index and specify an offset.

For example: collection.each.with_index(1){ |item, index| ... }.

Docs: Enumerator#with_index
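A small sketch of the difference, using a made-up list of names:

```ruby
queue = ["Ada", "Grace", "Linus"]  # hypothetical sample data

# each_with_index always counts from 0.
queue.each_with_index { |name, index| puts "#{index}: #{name}" }

# each.with_index(1) counts from the given offset instead.
queue.each.with_index(1) { |name, position| puts "#{position}: #{name}" }

# The same offset works on other enumerators, such as map.
labels = queue.map.with_index(1) { |name, position| "#{position}. #{name}" }
puts labels.inspect  # ["1. Ada", "2. Grace", "3. Linus"]
```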

URI regular expressions

Let’s go ahead and write some regex to validate a URI in Ruby:

URI.regexp

That’s it!

We can use it like so:

URI.regexp.match(my_uri)

If you have a look at the output of URI.regexp in irb, you’ll see the pattern that’s used to match against. It’s relatively complex, but each capture group is documented.

  1. Scheme
  2. Opaque (e.g. scheme:foo/bar)
  3. User Info
  4. Host
  5. Port
  6. Registry
  7. Path
  8. Query
  9. Fragment

irb(main):001:0> URI.regexp.match("http://username:password@foo.bar:80/baz.html?query=string#fragment")
=> #<MatchData "http://username:password@foo.bar:80/baz.html?query=string#fragment" 1:"http" 2:nil 3:"username:password" 4:"foo.bar" 5:"80" 6:nil 7:"/baz.html" 8:"query=string" 9:"fragment">
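One thing to keep in mind when validating: the pattern isn’t anchored, so match succeeds whenever a URI appears anywhere in the string. Here’s a minimal sketch of a stricter check, assuming we only care about http and https URIs. It builds the pattern with URI::RFC2396_Parser#make_regexp (which, as far as I know, is where the pattern above has traditionally come from) and wraps it in \A and \z. The HTTP_URI constant and the valid_http_uri? helper are names made up for this example.

```ruby
require "uri"

# Restrict the pattern to two schemes and anchor it to the whole string.
HTTP_URI = /\A#{URI::RFC2396_Parser.new.make_regexp(%w[http https])}\z/

# Hypothetical helper: true only when the entire string is an http(s) URI.
def valid_http_uri?(string)
  HTTP_URI.match?(string)
end

puts valid_http_uri?("http://foo.bar:80/baz.html?query=string")  # true
puts valid_http_uri?("see http://foo.bar/ for details")          # false
puts valid_http_uri?("ftp://foo.bar/baz.html")                   # false
```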

Hash.new with a block works well as a keyed cache

We’re all familiar with the @foo ||= expensive_computation technique to memoize (i.e. cache) the output of slow computations or to avoid unnecessary object creation.

If you want to do the same but with results that will vary by a single parameter, you can use Ruby’s hash with its block-based initializer to handle the caching for you:

def my_cache
  @my_cache ||= Hash.new do |hash, key|
    hash[key] = some_slow_computation(key)
  end
end

# Computed once
my_cache[:foo]

# Then cached
my_cache[:foo]

You can see this work in practice inside dry-component’s Injector, where it caches objects to allow arbitrarily long chaining of injector strategies without creating duplicate injector objects.
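To convince yourself the block only runs once per key, here’s a self-contained sketch with a counter. SquareTable and slow_square are made-up names standing in for whatever slow work you’re caching.

```ruby
class SquareTable
  attr_reader :computations

  def initialize
    @computations = 0
  end

  # Keyed cache: the block only runs for keys the hash has not seen yet.
  def cache
    @cache ||= Hash.new do |hash, key|
      hash[key] = slow_square(key)
    end
  end

  private

  # Stand-in for an expensive computation; counts how often it runs.
  def slow_square(number)
    @computations += 1
    number * number
  end
end

table = SquareTable.new
table.cache[4]            # computed
table.cache[4]            # served from the hash
table.cache[5]            # computed
puts table.computations   # 2
```

Two details worth knowing: unlike ||=, this also caches nil and false results, because the block only fires for missing keys. And only Hash#[] triggers the block; key? and fetch don’t.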

Symbolize keys

If you’re writing plain old Ruby you can symbolize the keys of a hash with the following:

hash.inject({}){|memo,(k,v)| memo[k.to_sym] = v; memo}

This does the following:

hash = {"hello" => "jojo"}
hash.inject({}) { |memo, (k, v)| memo[k.to_sym] = v; memo }
# => {:hello=>"jojo"}
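On Ruby 2.5 and later, Hash#transform_keys does the same job without the manual accumulator. A short sketch, assuming every key responds to to_sym (the sample hash is made up):

```ruby
hash = { "hello" => "jojo", "count" => 3 }

# Returns a new hash; the original keeps its string keys.
symbolized = hash.transform_keys(&:to_sym)

puts symbolized.keys.inspect  # [:hello, :count]
puts hash.keys.inspect        # ["hello", "count"]

# The inject version gives the same result.
via_inject = hash.inject({}) { |memo, (k, v)| memo[k.to_sym] = v; memo }
puts symbolized == via_inject # true
```

Like the inject version, this only touches the top level: keys of nested hashes are left as they are.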

However, if you’re doing this in a web application you may want to make a module to make hash and array data transformations easily available to you. In most of our apps we would use transproc to help us with this; if you’re working in one of Icelab’s Rodakase apps this will be available to you.

require "transproc/all"

module Functions
  extend Transproc::Registry

  import Transproc::HashTransformations
  import Transproc::ArrayTransformations

  def self.t(*args)
    self[*args]
  end
end

Functions.t(:symbolize_keys)[{"foo" => "bar"}]
# => {foo: "bar"}

If you are using Rails then you can use the method hash.symbolize_keys or the destructive version hash.symbolize_keys! (both made available via ActiveSupport).

Pry is amazing

Pry is amazing! It has a whole bunch of helpful shortcuts you can use while working in a session.

I found the exception handling shortcuts particularly helpful. _ex_ will give you the last raised exception, and wtf? will show you a stacktrace from that exception (which is helpful, since stacktraces aren’t normally shown in interactive terminal sessions like this). Hilariously, you can add more question marks or exclamation marks on the end to see more detail.

If you work with Ruby app consoles regularly, you’d be doing yourself a favour by giving the Pry Wiki a good read!

Authenticating with AWS Elasticsearch

The AWS Elasticsearch service offers authentication via an IAM user, or by whitelisting IPs.

Here’s how to use IAM credentials to sign requests to the service when using Faraday and how to hook that into the Ruby elasticsearch gem.

To sign requests using Faraday, you can use a gem called faraday_middleware-aws-signers-v4, which provides a middleware that will sign your requests.

require 'faraday_middleware'
require 'faraday_middleware/aws_signers_v4'

conn = Faraday.new(url: 'address-of-your-AWS-es-service') do |faraday|
  faraday.request :aws_signers_v4, {
    credentials: Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY']),
    service_name: 'es',
    region: 'ap-southeast-2'
  }

  faraday.adapter :typhoeus
end

To get the client provided by the elasticsearch gem to use your Faraday configuration, you can pass that configuration to it like so:

faraday_config = lambda do |faraday|
  faraday.request :aws_signers_v4, {
    credentials: Aws::Credentials.new(
      ENV["ELASTICSEARCH_AWS_ACCESS_KEY_ID"],
      ENV["ELASTICSEARCH_AWS_SECRET_ACCESS_KEY"]
    ),
    service_name: "es",
    region: ENV["ELASTICSEARCH_AWS_REGION"]
  }

  faraday.adapter :typhoeus
end

elasticsearch_host_config = {
  host:   ENV["ELASTICSEARCH_HOST"],
  port:   ENV["ELASTICSEARCH_PORT"],
  scheme: ENV["ELASTICSEARCH_SCHEME"]
}

transport = Elasticsearch::Transport::Transport::HTTP::Faraday.new(hosts: [elasticsearch_host_config], &faraday_config)

client = Elasticsearch::Client.new(transport: transport)

You can then use the client object as usual, and you’ll get automatically signed requests.

Manually incrementing count columns in Rails

Adding a counter cache column to a model is a common optimisation we make in order to avoid unnecessary queries when trying to aggregate data associated with that particular model. Rails provides us with a number of ways to maintain the counter cache column’s value. The first is to follow the Rails convention and add counter_cache: true to a belongs_to association and ensure we have a correctly named *_count column.

The other way is to do it manually. In this case Rails provides us with a few convenience methods to increment a given column.

The first is ActiveRecord::Base#increment!(attribute, by).

increment! is defined as:

def increment!(attribute, by = 1)
  increment(attribute, by).update_attribute(attribute, self[attribute])
end

and increment is defined as:

def increment(attribute, by = 1)
  self[attribute] ||= 0
  self[attribute] += by
  self
end

Which means that we’re first fetching the attribute’s current value, incrementing it in Ruby, then passing the result on to update_attribute to be saved. This is not an atomic database operation: the value is read and written in separate steps, so if another process changes the count in between, one of the updates gets overwritten (a race condition).

The second is the class method ActiveRecord::Base.increment_counter(column_name, record_id).

increment_counter is defined as:

def increment_counter(counter_name, id)
  update_counters(id, counter_name => 1)
end

which executes SQL like:

UPDATE "table_name"
  SET "counter_name" = "counter_name" + 1
  WHERE id = 1

This means that we now have an atomic operation: the increment happens inside a single SQL statement, so concurrent increments aren’t lost and the counter cache value stays consistent across the system.
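The difference is easier to see with a toy simulation. This sketch uses no ActiveRecord at all: FakeTable is a made-up stand-in for a database table, likes_count is a hypothetical column, and the two “processes” are just two stale in-memory copies of the same row.

```ruby
# FakeTable is a hypothetical stand-in for a database table, not a Rails API.
class FakeTable
  def initialize
    @rows = { 1 => { likes_count: 0 } }
  end

  # Like loading a record: returns an in-memory copy of the row.
  def find(id)
    @rows.fetch(id).dup
  end

  # Like the increment! shown above: writes back a value computed in Ruby.
  def write(id, column, value)
    @rows.fetch(id)[column] = value
  end

  # Like increment_counter: the addition happens "inside the database".
  def increment_counter(id, column)
    @rows.fetch(id)[column] += 1
  end

  def count(id, column)
    @rows.fetch(id)[column]
  end
end

# Two processes load the same row before either one saves.
stale  = FakeTable.new
first  = stale.find(1)
second = stale.find(1)
stale.write(1, :likes_count, first[:likes_count] + 1)
stale.write(1, :likes_count, second[:likes_count] + 1)
puts stale.count(1, :likes_count)   # 1: one increment was lost

# The same two increments done as in-place updates.
atomic = FakeTable.new
2.times { atomic.increment_counter(1, :likes_count) }
puts atomic.count(1, :likes_count)  # 2
```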

Displaying booleans in Active Admin

In Active Admin, if you want to display a boolean property that doesn’t directly map to a database column, you can use status_tag to display the value in a friendly way:

column :featured do |thing|
  thing.featured? ? status_tag("yes", :ok) : status_tag("no")
end

There’s a bit more info in the Active Admin docs which shows you how to add classes too!

Formatting strings with C-like formatting codes

I learnt this while putting together the report emails for Ticketscout.

Most of the time, regular string interpolation is all that we need when working with strings. Sometimes, though, we want a little more control over the formatting.

Let’s say we have a set of numbers that we want to calculate the average for, then print that average to 1 decimal place.

FLOATS = [
  42.0,
  36.0,
  28.0,
  19.0,
  27.0,
  18.0,
  10.0
]
avg = FLOATS.reduce(0, :+) / FLOATS.size
"Average is: #{avg}" # => "Average is: 25.714285714285715"
"Average is: #{avg.round(1)}" # => "Average is: 25.7"

Normally, this is fine, but when working with some content, it may be nicer to use Ruby’s format strings via String#% instead (especially if trying to format more complex numbers).

"Average is %.1f" % avg # => "Average is 25.7"

We can also make printing hashes a little prettier.

dog = { name: 'poppy', breed: 'labrador', colour: 'black' }

"This is #{dog[:name]}, she is a #{dog[:colour]} #{dog[:breed]}."

Becomes:

"This is %{name}, she is a %{colour} %{breed}." % dog

Which, to me at least, reads a little nicer.

Kernel::sprintf has a more complete list of formatting strings.
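A few more format codes in action, as a rough sketch. The report hash and its values are made up; format is the same thing as sprintf.

```ruby
avg = 180.0 / 7  # the same average as above

one_decimal = format("Average is %.1f", avg)
padded      = format("Average is %8.3f", avg)  # number padded to 8 characters
zero_padded = format("%05d tickets", 42)

# %<name>... applies a format to a named value; %{name} inserts it as-is.
report = { event: "Gala", avg: avg }           # hypothetical data
named  = format("%{event}: %<avg>.1f", report)

puts one_decimal  # Average is 25.7
puts padded       # Average is   25.714
puts zero_padded  # 00042 tickets
puts named        # Gala: 25.7
```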

Dynamic classnames in Slim

While glancing through one of Narinda’s pull-requests, I noticed she’d used a syntax for dynamic classnames in Slim that I had not seen before:

/ Aww yeah
- dynamic_classname = (truthy_test ? 'foo' : 'bar')
.static-class-name-one class=["static-class-name-two", dynamic_classname]
  ' Foo or bar?

Slim will magically concatenate the classnames in the array, with the dynamic one chosen by the truthiness of truthy_test. Much more flexible (and less stinky) than the string concatenation I would usually use:

/ Eww
- dynamic_classname = (truthy_test ? 'foo' : 'bar')
.static-class-name-one class="static-class-name-two #{dynamic_classname}"
  ' Foo or bar?