Mastering Timestamping in Rails
Average Ruby Dev
UPD: 31/03/2024 I've opened a PR to add a touch option to the #update_columns and #update_column methods.
In modern web development, precise and efficient data management is crucial for making informed business decisions.
Data consistency, particularly in how database records are timestamped, plays a large part in this. Data engineers often rely on these timestamps not only for record-keeping but also to download only the data that has changed, avoiding the need to process entire tables.
This guide addresses the inconsistencies in how Rails handles timestamps. It explores strategies for keeping timestamps accurate and dependable, and offers practical tips whether you're new to Rails or looking for a solution to a specific timestamping problem.
TLDR
Not all ActiveRecord persistence methods affect timestamps or have a touch option. For methods like update_columns that don't automatically update timestamps, you can create a RuboCop custom cop, modify ActiveRecord directly, or use database triggers to keep updated_at always up-to-date.
Each approach has its advantages and disadvantages, from how easy it is to maintain to potential compatibility issues with future Rails updates and the risk of raw SQL bypassing ActiveRecord entirely.
If you're interested in this topic, join the Rails Discussion thread https://discuss.rubyonrails.org/t/proposal-add-touch-option-for-update-columns-update-column/85388
ActiveRecord timestamps configuration
ActiveRecord automatically timestamps create and update operations if the table has fields named created_at/created_on or updated_at/updated_on.
To turn off timestamping, add:
config.active_record.record_timestamps = false # true is default
Timestamps are in UTC by default, but you can use the local timezone by setting:
config.active_record.default_timezone = :local # is :utc by default
ActiveRecord keeps all the datetime and time columns timezone aware. By default, these values are stored in the database as UTC and converted back to the current Time.zone when pulled from the database.
This feature can be turned off completely by setting:
config.active_record.time_zone_aware_attributes = false # is true by default
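To make the "stored as UTC, shown in the current zone" idea concrete, here is a minimal plain-Ruby sketch that runs without Rails. It uses the standard library's Time in place of ActiveSupport's Time.zone, and the +03:00 offset is an arbitrary value chosen only for illustration.

```ruby
require "time"

# The same instant, held as UTC and presented in a local offset.
# "+03:00" is an arbitrary example offset standing in for Time.zone.
stored_at = Time.utc(2024, 3, 31, 12, 0, 0)  # what the database would hold
shown_at  = stored_at.getlocal("+03:00")     # what the application would display

puts stored_at.iso8601      # => 2024-03-31T12:00:00Z
puts shown_at.iso8601       # => 2024-03-31T15:00:00+03:00
puts stored_at == shown_at  # => true, only the presentation differs
```

The comparison at the end is the point: converting between zones changes how the value is displayed, not which moment it refers to.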
ActiveRecord persistence methods and touching timestamps
There are a lot of persistence methods in ActiveRecord, but not all of them touch timestamps or have a touch option. You can find most of them in the table below.
| Module#method | updates timestamps if record_timestamps == true | has touch option |
| --- | --- | --- |
| Persistence#save | yes | yes |
| Persistence#save! | yes | yes |
| Persistence#create | yes | no |
| Persistence#create! | yes | no |
| Persistence#update | yes | no |
| Persistence#update! | yes | no |
| Persistence#update_attribute | yes | no |
| Persistence#touch | yes | no |
| Persistence#increment! | no | yes |
| Persistence#update_column | no | no |
| Persistence#update_columns | no | no |
| Persistence#toggle! | yes | no |
| Persistence#insert | yes | yes, via record_timestamps keyword argument |
| Persistence#insert! | yes | yes, via record_timestamps keyword |
| Persistence#insert_all | yes | yes, via record_timestamps keyword |
| Persistence#upsert_all | yes | yes, via record_timestamps keyword |
| Relation#update_all | no | no |
| Relation#touch_all | yes | yes, via positional arguments |
| Relation#update_counters | no | yes |
As you can see, three methods neither update timestamps by default nor provide a touch option: update_column, update_columns, and update_all. This can be a problem when, for example, an ETL process reads updated_at to find changed rows instead of copying the whole table. If someone uses update_columns for performance reasons, those changes are silently missed by the export. There are a few ways to solve this problem.
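The effect on an incremental export can be modeled in a few lines of plain Ruby. This is only a sketch: Row and changed_since are hypothetical names, and the two in-memory rows stand in for database records.

```ruby
# Plain-Ruby model of an incremental export that relies on updated_at.
# Row and changed_since are hypothetical names used only for this sketch.
Row = Struct.new(:id, :last_ip, :updated_at)

# Returns the rows an export would pick up since the previous run.
def changed_since(rows, watermark)
  rows.select { |row| row.updated_at > watermark }
end

last_export = Time.utc(2024, 3, 31, 0, 0, 0)

rows = [
  Row.new(1, "10.0.0.1", Time.utc(2024, 3, 30, 9, 0, 0)),
  Row.new(2, "10.0.0.2", Time.utc(2024, 3, 30, 9, 0, 0))
]

# Row 1 changes the way #update would do it: data and timestamp together.
rows[0].last_ip = "10.0.0.9"
rows[0].updated_at = Time.utc(2024, 3, 31, 8, 0, 0)

# Row 2 changes the way #update_columns would do it: data only.
rows[1].last_ip = "10.0.0.8"

exported_ids = changed_since(rows, last_export).map(&:id)
puts exported_ids.inspect  # => [1]
```

Row 2 has new data but never appears in the export, which is exactly the kind of gap a timestamp-based pipeline cannot detect on its own.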
Let's consider a pretty basic example.
class ApplicationController < ActionController::Base
  before_action :update_last_user_ip

  def update_last_user_ip
    ip = request.remote_ip
    # nothing to do if the IP has not changed
    return if current_user.last_ip == ip

    # we don't need callbacks or validations here,
    # so use #update_columns
    current_user.update_columns(
      last_ip: ip,
      # #update_columns skips timestamps, so set updated_at explicitly
      updated_at: Time.current
    )
  end
end
Here, we aim to update the user's last seen IP in their record without triggering any validations or callbacks. The simplest method for this is using the #update_columns method. However, to ensure the timestamp remains current, we must explicitly include updated_at. What issues might arise from this approach?
Several, including:
Remembering that #update_columns does not update timestamps, a behavior that is documented but might still catch you off guard.
The need to explicitly set updated_at/updated_on.
The absence of a touch option, unlike what you find in methods like #increment!.
The record_timestamps setting does not affect timestamp behavior.
So, what can we do if we want to consistently update timestamps across the application? There are a few solutions.
RuboCop Cop
RuboCop lets you write your own custom cops. Create a new file for the cop, which we'll name UpdateColumnsCop, in a folder you load custom cops from. A common location is lib/rubocop/cop/; the example below uses rubocop/cop/rails/ in the project root.
Here's a simple setup for your custom cop:
# rubocop/cop/rails/update_columns_timestamps.rb
module RuboCop
  module Cop
    module Rails
      # Flags `update_columns` calls that do not set `updated_at`/`updated_on`.
      class UpdateColumnsCop < RuboCop::Cop::Base
        extend RuboCop::Cop::AutoCorrector

        MSG = "Ensure `updated_at` or `updated_on` is updated when using `update_columns`"
        TIMESTAMP_KEYS = %i[updated_at updated_on].freeze

        def_node_matcher :update_columns?, <<~PATTERN
          (send _ :update_columns ...)
        PATTERN

        def on_send(node)
          return unless update_columns?(node)
          return if sets_timestamp?(node)

          add_offense(node, message: MSG) do |corrector|
            # Append after the last argument; inserting right after the
            # method name would produce invalid Ruby.
            next unless node.last_argument

            corrector.insert_after(node.last_argument, ", updated_at: Time.current")
          end
        end

        private

        # True if any hash argument has an `updated_at` or `updated_on` symbol key.
        def sets_timestamp?(node)
          node.arguments.any? do |arg|
            arg.hash_type? && arg.pairs.any? do |pair|
              pair.key.sym_type? && TIMESTAMP_KEYS.include?(pair.key.value)
            end
          end
        end
      end
    end
  end
end
To make RuboCop aware of your custom cop, you need to register it. Create a .rubocop.yml file in your project root if you don't already have one, and add the following configuration:
require:
  - ./rubocop/cop/rails/update_columns_timestamps.rb

Rails/UpdateColumnsCop:
  Enabled: true
It reports a lint error when update_columns is used without an updated_at or updated_on attribute:
app/controllers/application_controller.rb:10:5: C: [Correctable] Rails/UpdateColumnsCop: Ensure updated_at or updated_on is updated when using update_columns
current_user.update_columns( ...
Pros:
Identifies violations without changing behavior
Can be modified or ignored like a standard RuboCop cop
Cons:
It can't handle update_column because it doesn't offer an option for timestamps. This requires an additional rule that completely discourages the use of update_column in favor of update_columns.
It doesn't address every situation. For instance, using raw SQL might still bypass updating the updated_at field.
Monkey patch ActiveRecord update_column, update_columns
This approach, inspired by Tim McCarthy's gist (with Unathi Chonco as the original author), includes a few modifications for safer patching and extra features.
Add an initializer for patches
# config/initializers/core_ext_require.rb
# NOTE: Require all patches in lib/core_ext
Dir[Rails.root.join("lib/core_ext/**/*.rb")].each { |f| require f }
Add a patch for the Persistence module
# lib/core_ext/active_record/persistence/update_columns_patch.rb
module CoreExt
  module ActiveRecord
    module Persistence
      module UpdateColumnsPatch
        # https://github.com/rails/rails/blob/36c1591bcb5e0ee3084759c7f42a706fe5bb7ca7/activerecord/lib/active_record/persistence.rb#L931-L954
        def update_columns(attributes)
          touch = attributes.delete(:touch) { self.class.record_timestamps }

          if touch
            names = touch if touch != true
            names = Array.wrap(names)
            options = names.extract_options!
            touch_updates = self.class.touch_attributes_with_time(*names, **options)
            attributes.merge!(touch_updates) unless touch_updates.empty?
          end

          super(attributes)
        end

        # https://github.com/rails/rails/blob/36c1591bcb5e0ee3084759c7f42a706fe5bb7ca7/activerecord/lib/active_record/persistence.rb#L910-L913
        def update_column(name, value, touch: true)
          update_columns(name => value, :touch => touch)
        end
      end
    end
  end
end

ActiveRecord::Persistence.prepend(CoreExt::ActiveRecord::Persistence::UpdateColumnsPatch)
This patch mimics the behavior of the #save method: it updates timestamps by default and introduces a touch: option to skip the update. It also respects the record_timestamps setting, both globally and at the model level. With this change, we can simplify our example as follows:
class ApplicationController < ActionController::Base
  before_action :update_last_user_ip

  def update_last_user_ip
    ip = request.remote_ip
    # nothing to do if the IP has not changed
    return if current_user.last_ip == ip

    # we don't need callbacks or validations here,
    # so use #update_columns
    current_user.update_columns(last_ip: ip)
  end
end
Now #update_columns automatically updates the updated_at field. However, if you need to avoid updating it for some reason, you can explicitly use the touch option:
current_user.update_columns(last_ip: request.remote_ip, touch: false)
Also, if attribute names are provided, they are updated together with the updated_at/updated_on attributes, similar to how #update_counters works.
current_user.update_columns(
  last_ip: request.remote_ip,
  touch: :last_ip_updated_at
)
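The mechanism the patch relies on, Module#prepend combined with super, can be tried outside Rails. The following is a simplified sketch, not the patch itself: FakeRecord and TouchOnUpdateColumns are hypothetical names, the record is just a hash wrapper, and only a boolean touch: option is handled.

```ruby
# Stand-in for an ActiveRecord model; FakeRecord is hypothetical and only
# stores attributes in a hash so the sketch runs without Rails.
class FakeRecord
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  # Mimics the unpatched behavior: writes exactly what it is given.
  def update_columns(new_attributes)
    @attributes.merge!(new_attributes)
    true
  end
end

# Prepended module: its method runs first and delegates to the original via super.
module TouchOnUpdateColumns
  def update_columns(new_attributes)
    new_attributes = new_attributes.dup             # avoid mutating the caller's hash
    touch = new_attributes.delete(:touch) { true }  # default: touch the timestamp
    new_attributes[:updated_at] = Time.now.utc if touch
    super(new_attributes)
  end
end

FakeRecord.prepend(TouchOnUpdateColumns)

touched = FakeRecord.new(last_ip: "10.0.0.1")
touched.update_columns(last_ip: "10.0.0.2")
puts touched.attributes.key?(:updated_at)  # => true

skipped = FakeRecord.new(last_ip: "10.0.0.1")
skipped.update_columns(last_ip: "10.0.0.2", touch: false)
puts skipped.attributes.key?(:updated_at)  # => false
```

Because the module is prepended rather than included, it sits in front of the class in the method lookup chain, so existing callers of update_columns pick up the new behavior without any changes.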
Pros:
An ad-hoc solution that's easy to manage and adjust.
Behaves similarly to what we're used to with most methods in the Persistence module.
Cons:
Involves monkey patching, which might break in future Rails updates.
Doesn't address all scenarios. For example, using raw SQL might still bypass updating the updated_at field.
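One way to soften the first drawback is to fail fast at boot when the method being patched no longer looks the way the patch expects. The sketch below is an assumption-laden illustration, not part of the original patch: PatchGuard is a hypothetical helper, and FakePersistence stands in for ActiveRecord::Persistence so the code runs without Rails.

```ruby
# FakePersistence is a hypothetical stand-in for ActiveRecord::Persistence.
module FakePersistence
  def update_columns(attributes)
    attributes
  end
end

# Hypothetical helper: raises if the method to be patched is missing or its
# arity differs from what the patch was written against.
module PatchGuard
  def self.verify!(mod, method_name, expected_arity)
    unless mod.method_defined?(method_name)
      raise "#{mod}##{method_name} no longer exists; review the patch"
    end

    actual_arity = mod.instance_method(method_name).arity
    return true if actual_arity == expected_arity

    raise "#{mod}##{method_name} arity is #{actual_arity}, expected #{expected_arity}"
  end
end

# Passes silently while the signature matches what the patch assumes.
PatchGuard.verify!(FakePersistence, :update_columns, 1)
```

In an application, a check like this would run in the initializer just before the prepend call, with the real module in place of the stand-in. It only catches signature changes, not changes in behavior, so it complements tests rather than replacing them.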
If you think these changes are worth including in Rails, please join the Rails Discussion and leave a comment: https://discuss.rubyonrails.org/t/proposal-add-touch-option-for-update-columns-update-column/85388
Database triggers
If you need to update timestamps on each insert or update, even with raw SQL, you should use database triggers. Database triggers are pieces of procedural code that run in response to specific events in a database. For updating timestamps, this could be an UPDATE SQL statement.
First, we'll create the trigger function. This function is triggered whenever an update operation happens on a table. It will automatically update the updated_at column to the current timestamp.
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
This function, update_updated_at_column, is a simple PL/pgSQL function that sets the updated_at column of the new row (NEW) to the current timestamp (NOW()).
Next, you need to create a trigger for each table you want to track updates on. Here's how you can create a trigger for a specific table, let's say your_table_name:
CREATE TRIGGER update_your_table_name_trigger
BEFORE UPDATE ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();
This trigger, update_your_table_name_trigger, is set to execute before any update operation on your_table_name. It calls the update_updated_at_column function, which updates the updated_at column.
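Since the statement has to be repeated for every table, it can help to generate it. Here is a small plain-Ruby sketch under a few assumptions: updated_at_trigger_sql and SAFE_IDENTIFIER are hypothetical names, the update_updated_at_column() function is assumed to exist already, and table names are restricted to lowercase snake_case so they can be interpolated safely.

```ruby
# Only plain lowercase identifiers are accepted, since the name is
# interpolated directly into SQL.
SAFE_IDENTIFIER = /\A[a-z_][a-z0-9_]*\z/

# Hypothetical helper that renders the CREATE TRIGGER statement for a table.
# It assumes the update_updated_at_column() function already exists.
def updated_at_trigger_sql(table_name)
  unless table_name.match?(SAFE_IDENTIFIER)
    raise ArgumentError, "unexpected table name: #{table_name.inspect}"
  end

  <<~SQL
    CREATE TRIGGER update_#{table_name}_trigger
    BEFORE UPDATE ON #{table_name}
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();
  SQL
end

# "users" and "orders" are example table names.
%w[users orders].each { |table| puts updated_at_trigger_sql(table) }
```

In a Rails project the returned string could be passed to execute inside a migration, so each new table gets its trigger in the same change that creates it.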
To handle both insert and update events for setting created_at and updated_at timestamps, you'll need a separate trigger for each event type. Both triggers can call the same function, which sets created_at and updated_at to the current timestamp on insert and only updated_at on update:
CREATE OR REPLACE FUNCTION update_created_updated_at_columns()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    NEW.created_at = NOW();
    NEW.updated_at = NOW();
  ELSIF TG_OP = 'UPDATE' THEN
    NEW.updated_at = NOW();
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
This function checks the operation type (TG_OP) to determine if it's an insert or update operation. For inserts, it sets both created_at and updated_at to the current timestamp. For updates, it only updates the updated_at column.
Now, create the triggers for both insert and update events:
-- Trigger for INSERT
CREATE TRIGGER insert_your_table_name_trigger
BEFORE INSERT ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_created_updated_at_columns();
-- Trigger for UPDATE
CREATE TRIGGER update_your_table_name_trigger
BEFORE UPDATE ON your_table_name
FOR EACH ROW
EXECUTE FUNCTION update_created_updated_at_columns();
To manage triggers more easily within Rails, consider using tools like fx or hair_trigger.
Pros:
Always up-to-date created_at/updated_at timestamps
Cons:
Triggers are difficult to manage
You need to add triggers for each new table where you want to keep the timestamps current
Triggers make the app's behavior less obvious. They also take control away from the app in the cases where you don't want a timestamp to be updated