Let’s assume that your application has to insert thousand records at the same time, and you might want those records not to be duplicated. For example, you need to perform insertion from a CSV file, to insert contacts, and you want there’s no two contacts with the same email.
You might come up with the solution like: “Ah, ActiveRecord supports validations!”. I think validation like
class User validates_uniqueness_of :email end
I think the validation will work well in majority cases. However, let’s assume you’re doing that task in background job. Then you have another file, which has a lot of duplicated users, and new users, of course. You wanna do the second insertion right after that (or some actions like that at almost same time). Do you think that at the same moment, two jobs will insert the same users, and validations for those two records will both pass. Then? You have the duplicated users in your database.
It’s time to try the second solution. Let’s put the hard jobs on MySQL side. MySQL provides kind of index called UNIQUE index that allows you to enforce the uniqueness of values in one or more columns
You can use MYSQL command
CREATE UNIQUE INDEX idx_email ON users(email);
You can ask MYSQL to have unique index on multiple columns
CREATE UNIQUE INDEX idx_email_phone ON users(email, phone);
How can you do with migrations?
add_index :users, [:email, :phone], unique: true
Another thing that We should mention here, MySQL allow multiple rows with NULL even there is UNIQUE INDEX constraint.
Again, you can come up with two possible solutions.
MySQL side
change_column :users, :email, :string, null: false
Or use a validation
validates :email, presence: true
The last thing we should be aware of is that MySQL might raise an exception if there’s duplicated rows. Thus, we should have a rescue block
begin # Your code come here rescue StandardError => e # You can write exception detail to a log file using logger log = Logger.new("log/mysql.log") log.info("#{Time.now.strftime('%Y-%m-%d %H:%M:%S')} - #{e.class}: #{e.message} \n #{email}") end
That’s all for handling duplicated records
Happy coding!