MySql – Colin Dao

Screen Shot 2018-05-24 at 4.32.18 PM

Let’s assume that your application has to insert thousand records at the same time, and you might want those records not to be duplicated. For example, you need to perform insertion from a CSV file, to insert contacts, and you want there’s no two contacts with the same email.

You might come up with the solution like: “Ah, ActiveRecord supports validations!”. I think validation like

class User
  validates_uniqueness_of :email
end

I think the validation will work well in majority cases. However, let’s assume you’re doing that task in background job. Then you have another file, which has a lot of duplicated users, and new users, of course. You wanna do the second insertion right after that (or some actions like that at almost same time). Do you think that at the same moment, two jobs will insert the same users, and validations for those two records will both pass. Then? You have the duplicated users in your database.

It’s time to try the second solution. Let’s put the hard jobs on MySQL side. MySQL provides kind of index called UNIQUE index that allows you to enforce the uniqueness of values in one or more columns
You can use MYSQL command

CREATE UNIQUE INDEX idx_email
ON users(email);

You can ask MYSQL to have unique index on multiple columns

CREATE UNIQUE INDEX idx_email_phone
ON users(email, phone);

How can you do with migrations?

add_index :users, [:email, :phone], unique: true

Another thing that We should mention here, MySQL allow multiple rows with NULL even there is UNIQUE INDEX constraint.

Again, you can come up with two possible solutions.

MySQL side

change_column :users, :email, :string, null: false

Or use a validation

validates :email, presence: true

The last thing we should be aware of is that MySQL might raise an exception if there’s duplicated rows. Thus, we should have a rescue block

begin
  # Your code come here
rescue StandardError => e
  # You can write exception detail to a log file using logger
  log = Logger.new("log/mysql.log")
  log.info("#{Time.now.strftime('%Y-%m-%d %H:%M:%S')} - #{e.class}: #{e.message} \n #{email}")
end

That’s all for handling duplicated records
Happy coding!

## Dump all tables of database mysqldump -h database_host -u database_user_name -p database_name > my_db_dump.sql ## Dump some tables of database mysqldump -h database_host -u database_user_name -p database_name table_1 table_2 table_3 > db_part_1_dump.sql ## Dump database ignore some mysqldump -h database_host -u database_user_name -p database_name --ignore-table=my_database.table1 my_database.table2 > db_part_1_dump.sql

sudo scp -i path_to_private_ssh_key remote_user@remote_host:path_to_dump_db_file_on_remote path_to_where_you_want_to_store_on_local ## Or reverse way sudo scp -i path_to_private_ssh_key path_to_where_you_want_to_store_on_local remote_user@remote_host:path_to_dump_db_file_on_remote

Tag: MySql

Avoid unexpected duplication in MySQL database

Backup your Mysql database