Iterating Tables In Batches
Rails provides a method called in_batches
that can be used to iterate over
rows in batches. For example:
User.in_batches(of: 10) do |relation|
relation.update_all(updated_at: Time.now)
end
Unfortunately this method is implemented in a way that is not very efficient, both query and memory usage wise.
To work around this you can include the EachBatch
module into your models,
then use the each_batch
class method. For example:
class User < ActiveRecord::Base
include EachBatch
end
User.each_batch(of: 10) do |relation|
relation.update_all(updated_at: Time.now)
end
This will end up producing queries such as:
User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000
(0.7ms) SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687)
The API of this method is similar to in_batches
, though it doesn't support
all of the arguments that in_batches
supports. You should always use
each_batch
unless you have a specific need for in_batches
.