Avoid unexpected duplication in MySQL database

Screen Shot 2018-05-24 at 4.32.18 PM

Let’s assume that your application has to insert thousand records at the same time, and you might want those records not to be duplicated. For example, you need to perform insertion from a CSV file, to insert contacts, and you want there’s no two contacts with the same email.

You might come up with the solution like: “Ah, ActiveRecord supports validations!”. I think validation like

class User
  validates_uniqueness_of :email
end

I think the validation will work well in majority cases. However, let’s assume you’re doing that task in background job. Then you have another file, which has a lot of duplicated users, and new users, of course. You wanna do the second insertion right after that (or some actions like that at almost same time). Do you think that at the same moment, two jobs will insert the same users, and validations for those two records will both pass. Then? You have the duplicated users in your database.

It’s time to try the second solution. Let’s put the hard jobs on MySQL side. MySQL provides kind of index called UNIQUE index that allows you to enforce the uniqueness of values in one or more columns
You can use MYSQL command

CREATE UNIQUE INDEX idx_email
ON users(email);

You can ask MYSQL to have unique index on multiple columns

CREATE UNIQUE INDEX idx_email_phone
ON users(email, phone);

How can you do with migrations?

add_index :users, [:email, :phone], unique: true

Another thing that We should mention here, MySQL allow multiple rows with NULL even there is UNIQUE INDEX constraint.

Again, you can come up with two possible solutions.

MySQL side

change_column :users, :email, :string, null: false

Or use a validation

validates :email, presence: true

The last thing we should be aware of is that MySQL might raise an exception if there’s duplicated rows. Thus, we should have a rescue block

begin
  # Your code come here
rescue StandardError => e
  # You can write exception detail to a log file using logger
  log = Logger.new("log/mysql.log")
  log.info("#{Time.now.strftime('%Y-%m-%d %H:%M:%S')} - #{e.class}: #{e.message} \n #{email}")
end

That’s all for handling duplicated records
Happy coding!

Backup your Mysql database


It is not always that in a project you can use library as yaml_db or seed_fu. Sometimes you just want to move a database from this server to another server. And ofcourse, use mysqldump will be faster.

First, you my want to connect to remote server via ssh and private key:

sudo ssh -i path_to_your_key remote_user_name@remote_server_address

Dump database:

## Dump all tables of database
mysqldump -h database_host -u database_user_name -p database_name > my_db_dump.sql

## Dump some tables of database
mysqldump -h database_host -u database_user_name -p database_name table_1 table_2 table_3 > db_part_1_dump.sql

## Dump database ignore some
mysqldump -h database_host -u database_user_name -p database_name --ignore-table=my_database.table1 my_database.table2 > db_part_1_dump.sql

Restore database:

mysql -u db_user_name databse_name -p -h host_name < path_to_sql_dump_file

If you need to copy file from your remote server to local computer, use this

sudo scp -i path_to_private_ssh_key remote_user@remote_host:path_to_dump_db_file_on_remote path_to_where_you_want_to_store_on_local

## Or reverse way
sudo scp -i path_to_private_ssh_key path_to_where_you_want_to_store_on_local remote_user@remote_host:path_to_dump_db_file_on_remote