Mastering Timestamping in Rails

💡 All information here is current as of the date of writing, 30.03.2024, and applies to Rails 7.1, 7.0, 6.1, and some earlier versions. This may change in future releases, so check the docs and the source code!

UPD 31.03.2024: I've opened a PR to add a touch option to the #update_columns and #update_column methods.


In modern web development, precise and efficient data management is crucial for making informed business decisions.

Data consistency, particularly in how database records are timestamped, plays a key role. Data engineers often rely on these timestamps not only for record-keeping but also to download only the data that has changed, avoiding the need to process entire tables.

This guide addresses the inconsistencies in how Rails handles timestamps. It explores strategies for keeping timestamps accurate and dependable. Whether you're new to the field or looking for a solution to a specific timestamping problem, this article provides practical tips to improve your data management practices.

TLDR

Not all ActiveRecord persistence methods affect timestamps or have a touch option. For methods like update_columns that don't automatically update timestamps, you can create a RuboCop custom cop, modify ActiveRecord directly, or use database triggers to keep updated_at always up-to-date.

Each method has its advantages and disadvantages, from how easy they are to manage to potential compatibility issues with future Rails updates or the risk of SQL operations skipping ActiveRecord methods.

If you're interested in this topic, join the Rails Discussion thread https://discuss.rubyonrails.org/t/proposal-add-touch-option-for-update-columns-update-column/85388

ActiveRecord timestamps configuration

ActiveRecord automatically timestamps create and update operations if the table has fields named created_at/created_on or updated_at/updated_on.

To turn off timestamping, add the following to your application configuration:

config.active_record.record_timestamps = false # true is default

Timestamps are in UTC by default, but you can use the local timezone by setting:

config.active_record.default_timezone = :local # is :utc by default

ActiveRecord keeps all the datetime and time columns timezone aware. By default, these values are stored in the database as UTC and converted back to the current Time.zone when pulled from the database.

This feature can be turned off completely by setting:

config.active_record.time_zone_aware_attributes = false # is true by default
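
As a rough illustration of what timezone-aware attributes do, the plain-Ruby sketch below (no Rails required) takes an instant stored in UTC and converts it to a fixed offset on read. The helper name read_in_app_zone and the "+02:00" offset are assumptions made for this example; ActiveRecord itself converts to the configured Time.zone rather than to a raw offset.

```ruby
require "time" # for Time#iso8601

# Plain-Ruby illustration (no Rails). The fixed "+02:00" offset is an
# assumption standing in for whatever Time.zone is configured in the app.
def read_in_app_zone(stored_utc, offset = "+02:00")
  stored_utc.getlocal(offset) # same instant, different wall-clock representation
end

stored = Time.utc(2024, 3, 30, 12, 0, 0) # value as written to the database
shown  = read_in_app_zone(stored)

puts stored.iso8601   # 2024-03-30T12:00:00Z
puts shown.iso8601    # 2024-03-30T14:00:00+02:00
puts stored == shown  # true
```

Both values describe the same moment, so comparisons and incremental queries based on updated_at are not affected by the display zone.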

ActiveRecord persistence methods and touching timestamps

There are a lot of persistence methods in ActiveRecord, but not all of them touch timestamps or have a touch option. You can find most of them in the table below.

| Module#method | Updates timestamps if record_timestamps == true | Has touch option |
|---|---|---|
| Persistence#save | yes | yes |
| Persistence#save! | yes | yes |
| Persistence#create | yes | no |
| Persistence#create! | yes | no |
| Persistence#update | yes | no |
| Persistence#update! | yes | no |
| Persistence#update_attribute | yes | no |
| Persistence#touch | yes | no |
| Persistence#increment! | no | yes |
| Persistence#update_column | no | no |
| Persistence#update_columns | no | no |
| Persistence#toggle! | yes | no |
| Persistence#insert | yes | yes, via record_timestamps keyword argument |
| Persistence#insert! | yes | yes, via record_timestamps keyword argument |
| Persistence#insert_all | yes | yes, via record_timestamps keyword argument |
| Persistence#upsert_all | yes | yes, via record_timestamps keyword argument |
| Relation#update_all | no | no |
| Relation#touch_all | yes | yes, via positional arguments |
| Relation#update_counters | no | yes |

As you can see, three methods neither update timestamps by default nor provide a touch option: update_column, update_columns, and update_all. This can be a problem when, for example, an ETL process reads updated_at to pick up changed rows instead of copying the whole table. If someone uses update_columns for performance reasons, that process will miss those changes. However, there are a couple of ways to solve this problem.
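
To make the missed-update scenario concrete, here is a small plain-Ruby simulation (no Rails, no database). Row and changed_since are hypothetical names invented for this sketch; a real ETL job would run an equivalent WHERE updated_at > ... query.

```ruby
# Plain-Ruby simulation; Row and changed_since are hypothetical names.
Row = Struct.new(:id, :last_ip, :updated_at)

# An incremental extract only picks rows whose updated_at moved past the last sync.
def changed_since(rows, last_sync_at)
  rows.select { |row| row.updated_at > last_sync_at }
end

last_sync_at = Time.utc(2024, 3, 30, 12, 0, 0)
rows = [
  Row.new(1, "10.0.0.1", Time.utc(2024, 3, 30, 11, 0, 0)),
  Row.new(2, "10.0.0.2", Time.utc(2024, 3, 30, 11, 0, 0))
]

# Row 1 changes the way #update_columns does it: the timestamp is left alone.
rows[0].last_ip = "10.0.0.9"

# Row 2 changes and its timestamp is bumped as well.
rows[1].last_ip = "10.0.0.8"
rows[1].updated_at = Time.utc(2024, 3, 30, 12, 30, 0)

p changed_since(rows, last_sync_at).map(&:id) # => [2]; the change to row 1 is missed
```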

Let's consider a pretty basic example.

class ApplicationController < ActionController::Base
  before_action :update_last_user_ip

  def update_last_user_ip
    ip = request.remote_ip
    # nothing to do if the IP has not changed
    return if current_user.last_ip == ip

    # we don't need any callbacks or validations here,
    # so use #update_columns
    current_user.update_columns(
      last_ip: ip,
      # #update_columns skips timestamps, so set updated_at explicitly
      # to keep track of the last change
      updated_at: Time.current
    )
  end
end

Here, we aim to update the user's last seen IP in their record without triggering any validations or callbacks. The simplest method for this is using the #update_columns method. However, to ensure the timestamp remains current, we must explicitly include updated_at. What issues might arise from this approach?

Several, including:

  • Remembering that #update_columns does not update timestamps, a behavior that is documented but might still catch you off guard.

  • The need to explicitly set updated_at/updated_on.

  • The absence of a touch option, unlike what you find in methods like #increment!.

  • The record_timestamps setting has no effect on #update_columns: timestamps are skipped even when it is true.

So, what can we do if we want to consistently update timestamps across the application? There are a few solutions.

RuboCop cop

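RuboCop lets you write your own custom cops. Create a new file for the cop, which we'll name UpdateColumnsCop. RuboCop does not discover custom cops on its own, so the file can live anywhere in the project as long as .rubocop.yml requires it. This example keeps it at rubocop/cop/rails/update_columns_timestamps.rb; lib/rubocop/cop/ is another common location.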

Here's a simple setup for your custom cop:

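# rubocop/cop/rails/update_columns_timestamps.rb
module RuboCop
  module Cop
    module Rails
      # Flags `update_columns` calls that do not set `updated_at`/`updated_on`.
      class UpdateColumnsCop < RuboCop::Cop::Base
        extend RuboCop::Cop::AutoCorrector

        MSG = "Ensure `updated_at` or `updated_on` is updated when using `update_columns`"
        RESTRICT_ON_SEND = %i[update_columns].freeze
        TIMESTAMP_KEYS = %i[updated_at updated_on].freeze

        def_node_matcher :update_columns?, <<~PATTERN
          (send _ :update_columns ...)
        PATTERN

        def on_send(node)
          return unless update_columns?(node)
          return if timestamp_given?(node)

          add_offense(node) do |corrector|
            last_argument = node.last_argument
            # Only autocorrect `update_columns(key: value)`; other argument
            # shapes (variables, braced hashes) need a manual fix.
            next unless last_argument&.hash_type? && !last_argument.braces?

            # Append the timestamp after the last key/value pair
            corrector.insert_after(last_argument, ", updated_at: Time.current")
          end
        end

        private

        # Checks whether `updated_at` or `updated_on` is among the hash keys
        def timestamp_given?(node)
          node.arguments.any? do |arg|
            arg.hash_type? && arg.pairs.any? do |pair|
              key = pair.key
              (key.sym_type? || key.str_type?) && TIMESTAMP_KEYS.include?(key.value.to_sym)
            end
          end
        end
      end
    end
  end
end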

To make RuboCop aware of your custom cop, you need to register it. Create a .rubocop.yml file in your project root if you don't already have one, and add the following configuration:

require:
 - ./rubocop/cop/rails/update_columns_timestamps.rb

Rails/UpdateColumnsCop:
 Enabled: true

RuboCop then reports an offense whenever update_columns is called without an updated_at or updated_on attribute:

app/controllers/application_controller.rb:10:5: C: [Correctable] Rails/UpdateColumnsCop: Ensure updated_at or updated_on is updated when using update_columns
    current_user.update_columns( ...

Pros:

  • Identifies violations without changing behavior

  • Can be modified or ignored like a standard RuboCop cop

Cons:

  • It can't handle update_column, because that method takes a single name and value and leaves no room to pass a timestamp. This requires an additional rule that discourages update_column entirely in favor of update_columns.

  • It doesn't address every situation. For instance, using raw SQL might still bypass updating the updated_at field.

Monkey patching ActiveRecord's update_column and update_columns

This method, inspired by Tim McCarthy's gist (with Unathi Chonco as an original author), includes a few modifications for safer patching and extra features.

  • Add an initializer for patches

      # config/initializers/core_ext_require.rb
      # NOTE: Require all patches in lib/core_ext
      Dir[Rails.root.join("lib/core_ext/**/*.rb")].each { |f| require f }
    
  • Add a patch for the Persistence module

      # lib/core_ext/active_record/persistence/update_columns_patch.rb
      module CoreExt
        module ActiveRecord
          module Persistence
            module UpdateColumnsPatch
              # https://github.com/rails/rails/blob/36c1591bcb5e0ee3084759c7f42a706fe5bb7ca7/activerecord/lib/active_record/persistence.rb#L931-L954
              def update_columns(attributes)
                # work on a copy so the caller's hash is not mutated
                attributes = attributes.dup
                # without an explicit touch: option, follow record_timestamps
                touch = attributes.delete(:touch) { self.class.record_timestamps }
                if touch
                  # touch: true updates only updated_at/updated_on;
                  # touch: :name or touch: [:a, :b] updates those columns as well
                  names = touch if touch != true
                  names = Array.wrap(names)
                  options = names.extract_options!
                  touch_updates = self.class.touch_attributes_with_time(*names, **options)
                  attributes.merge!(touch_updates) unless touch_updates.empty?
                end
                super(attributes)
              end

              # https://github.com/rails/rails/blob/36c1591bcb5e0ee3084759c7f42a706fe5bb7ca7/activerecord/lib/active_record/persistence.rb#L910-L913
              # The default follows record_timestamps so that the setting is
              # respected here too, not only in #update_columns
              def update_column(name, value, touch: self.class.record_timestamps)
                update_columns(name => value, :touch => touch)
              end
            end
          end
        end
      end

      ActiveRecord::Persistence.prepend(CoreExt::ActiveRecord::Persistence::UpdateColumnsPatch)
    

    This patch mimics the behavior of the #save method: it updates timestamps by default and introduces a touch: option to choose whether to skip the update. It also respects the record_timestamps setting, both globally and at the model level. With this change, we can simplify our example as follows:

      class ApplicationController < ActionController::Base
        before_action :update_last_user_ip

        def update_last_user_ip
          ip = request.remote_ip
          # nothing to do if the IP has not changed
          return if current_user.last_ip == ip

          # we don't need any callbacks or validations here,
          # so use #update_columns
          current_user.update_columns(last_ip: ip)
        end
      end
    

    With the patch applied, #update_columns updates the updated_at field automatically. However, if you need to avoid updating it for some reason, you can explicitly use the touch option:

      current_user.update_columns(last_ip: request.remote_ip, touch: false)
    

    Also, if attribute names are provided, they are updated together with the updated_at/updated_on attributes, similar to how #update_counters works.

      current_user.update_columns(
        last_ip: request.remote_ip,
        touch: :last_ip_updated_at
      )
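
The three cases of the touch option can be modeled outside Rails. The sketch below is a plain-Ruby approximation of the patch's decision logic, not ActiveRecord code: resolve_touch is a hypothetical name, and the timestamp columns and the record_timestamps value are passed in explicitly here, whereas the patch reads them from the model class.

```ruby
# Plain-Ruby model of the patch's decision logic; not ActiveRecord code.
# Returns the attributes that would be written, including any timestamps.
def resolve_touch(attributes, record_timestamps:, now:, timestamp_columns: [:updated_at])
  attributes = attributes.dup                      # leave the caller's hash untouched
  touch = attributes.fetch(:touch, record_timestamps)
  attributes.delete(:touch)
  return attributes unless touch                   # touch: false, or timestamps disabled

  extra = touch == true ? [] : Array(touch)        # extra columns named via touch:
  (timestamp_columns + extra).each { |column| attributes[column] = now }
  attributes
end

now = Time.utc(2024, 3, 31)
p resolve_touch({ last_ip: "10.0.0.1" }, record_timestamps: true, now: now).keys
# => [:last_ip, :updated_at]
p resolve_touch({ last_ip: "10.0.0.1", touch: false }, record_timestamps: true, now: now).keys
# => [:last_ip]
p resolve_touch({ last_ip: "10.0.0.1", touch: :last_ip_updated_at }, record_timestamps: true, now: now).keys
# => [:last_ip, :updated_at, :last_ip_updated_at]
```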
    

Pros:

  • An ad-hoc solution that's easy to manage and adjust.

  • Behaves similarly to what we're used to with most methods in the Persistence module.

Cons:

  • Involves monkey patching, which might break in future Rails updates.

  • Doesn't address all scenarios, for example, using raw SQL might still bypass updating the updated_at field.

If you think these changes are worth including in Rails, please join the Rails Discussion and leave a comment: https://discuss.rubyonrails.org/t/proposal-add-touch-option-for-update-columns-update-column/85388

Database triggers

If you need to update timestamps on each insert or update, even with raw SQL, you should use database triggers. Database triggers are pieces of procedural code that run in response to specific events in a database. For updating timestamps, this could be an UPDATE SQL statement.

First, we'll create the trigger function. The examples below use PostgreSQL and its PL/pgSQL language. The function runs whenever an update operation happens on a table and automatically sets the updated_at column to the current timestamp.

CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

This function, update_updated_at_column, is a simple PL/pgSQL function that sets the updated_at column of the new row (NEW) to the current timestamp (NOW()).

Next, you need to create a trigger for each table you want to track updates on. Here's how you can create a trigger for a specific table, let's say your_table_name:

CREATE TRIGGER update_your_table_name_trigger
BEFORE UPDATE ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();

This trigger, update_your_table_name_trigger, is set to execute before any update operation on your_table_name. It calls the update_updated_at_column function, which updates the updated_at column.
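
Since the same statement has to be repeated for every table, it can help to generate it. The following is a minimal sketch of such a generator in plain Ruby; updated_at_trigger_sql is a hypothetical helper (not part of Rails), and it assumes the update_updated_at_column function above already exists in the database.

```ruby
# Hypothetical helper (not part of Rails): builds the CREATE TRIGGER statement
# for one table so that trigger naming stays consistent across tables.
def updated_at_trigger_sql(table_name)
  # Allow only plain identifiers, since the name is interpolated into SQL.
  unless table_name.to_s.match?(/\A[a-z_][a-z0-9_]*\z/)
    raise ArgumentError, "unexpected table name: #{table_name.inspect}"
  end

  <<~SQL
    CREATE TRIGGER update_#{table_name}_trigger
    BEFORE UPDATE ON #{table_name}
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();
  SQL
end

puts updated_at_trigger_sql(:users)
```

In a Rails migration, the returned string could be passed to execute, which runs raw SQL against the database.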

To handle both insert and update events for setting created_at and updated_at timestamps, you'll need two separate triggers, one per event type, both calling the same function. The first trigger handles the insert event, setting both created_at and updated_at to the current timestamp. The second trigger handles the update event, setting only the updated_at column to the current timestamp:

CREATE OR REPLACE FUNCTION update_created_updated_at_columns()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        NEW.created_at = NOW();
        NEW.updated_at = NOW();
    ELSIF TG_OP = 'UPDATE' THEN
        NEW.updated_at = NOW();
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

This function checks the operation type (TG_OP) to determine if it's an insert or update operation. For inserts, it sets both created_at and updated_at to the current timestamp. For updates, it only updates the updated_at column.

Now, create the triggers for both insert and update events:

-- Trigger for INSERT
CREATE TRIGGER insert_your_table_name_trigger
BEFORE INSERT ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_created_updated_at_columns();

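-- Trigger for UPDATE
-- Trigger names must be unique per table, so drop the trigger from the
-- earlier example first if it already exists
DROP TRIGGER IF EXISTS update_your_table_name_trigger ON your_table_name;
CREATE TRIGGER update_your_table_name_trigger
BEFORE UPDATE ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_created_updated_at_columns();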

To manage triggers within Rails more easily, consider tools like fx or hair_trigger.

Pros:

  • Always up-to-date created_at/updated_at timestamps

Cons:

  • Triggers are difficult to manage

  • You need to add triggers for each new table where you want to keep the timestamps current

  • Triggers make app behavior less obvious. They also take control away from the app: sometimes you might not want to update timestamps, and the trigger will do it anyway
