Mysqldump Encoding: Avoid Broken Characters in Database Exports

By Greg Nowak. Last updated 2026-07-02.

Broken characters in a MySQL export are usually not caused by the text editor, the shell, or the file extension. They normally appear when the dump connection reads data with one character set while the schema, columns, or restore process expect another. For a business migration, agency handoff, or staging refresh, that small mismatch can show up as damaged names, product titles, addresses, email templates, and CMS content.

The practical fix is not to guess. Check how the source database is defined, dump with the right client character set, keep the dump self-describing where possible, and run a test restore before anyone treats the file as delivery-ready.

Confirm what the database actually uses

Current MySQL documentation recommends utf8mb4 wherever possible and lists it as the default server character set. Real production systems are less tidy. Older applications may still contain latin1, utf8mb3, or a mix of database defaults and column-level exceptions.

Start with these checks before exporting:

SHOW VARIABLES LIKE 'character_set_%';
SHOW VARIABLES LIKE 'collation_%';

SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM INFORMATION_SCHEMA.SCHEMATA
WHERE SCHEMA_NAME = 'DATABASE_NAME';

SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'DATABASE_NAME'
  AND CHARACTER_SET_NAME IS NOT NULL
ORDER BY TABLE_NAME, ORDINAL_POSITION;

The first queries show the server and connection settings. The INFORMATION_SCHEMA queries show what the schema and text columns declare. If the database default is utf8mb4 but a legacy customer table still has latin1 columns, document that before you export. A clean-looking dump command will not explain that history to the next team.
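When the column listing is long, a per-charset count is a quicker first read. The sketch below is one possible way to produce it from a shell. The function name charset_audit_sql and the schema name shop are made up for illustration, and the schema name is pasted into the SQL unescaped, so only pass names you trust.

```shell
# Hypothetical helper: print a summary query that counts text columns per
# character set in one schema. It only builds SQL; it does not connect.
charset_audit_sql() {
  schema_name=$1
  printf '%s\n' \
    "SELECT CHARACTER_SET_NAME, COUNT(*) AS column_count" \
    "FROM INFORMATION_SCHEMA.COLUMNS" \
    "WHERE TABLE_SCHEMA = '${schema_name}'" \
    "  AND CHARACTER_SET_NAME IS NOT NULL" \
    "GROUP BY CHARACTER_SET_NAME" \
    "ORDER BY column_count DESC;"
}
```

The output can be piped to the client, for example charset_audit_sql shop | mysql -u USER -p --table. More than one row in the result means the schema mixes charsets and deserves the column-level listing above.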

What you find | Likely export choice | Handoff check
All text columns use utf8mb4 | Use normal mysqldump behavior or set utf8mb4 explicitly | Test accents, emoji, and multilingual content after restore
Schema or columns use latin1 | Use latin1 only when metadata confirms it | Check whether the app stored valid latin1 or misencoded UTF-8 bytes
Mixed charsets across old tables | Export carefully, table by table if needed | Map risky fields and agree whether conversion is in scope
Text is already broken in the live app | A dump will preserve the damage | Plan data repair separately from backup and migration
A quick decision matrix for choosing the least risky mysqldump charset path.
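
The latin1 row hides a common trap: the column says latin1, but the application wrote UTF-8 bytes into it. As a rough illustration of what that looks like when those bytes are later read as latin1, the sketch below uses iconv (assumed to be installed) to reinterpret UTF-8 input as ISO-8859-1. The function name is invented for this example.

```shell
# Hypothetical demo: treat UTF-8 input as if it were ISO-8859-1 and
# re-encode it as UTF-8, which is effectively what a charset mismatch does.
misread_as_latin1() {
  iconv -f ISO-8859-1 -t UTF-8
}

# "café" as UTF-8 bytes (é is 0xC3 0xA9), then misread.
printf 'caf\303\251\n' | misread_as_latin1
# prints: cafÃ©
```

If live data looks like that output, with Ã followed by another symbol where an accented letter should be, the stored bytes are probably misencoded UTF-8 rather than genuine latin1 text.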

Use utf8mb4 as the normal modern path

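For most current MySQL projects, the safest starting point is a self-describing dump using utf8mb4. MySQL documents current mysqldump as using utf8mb4 when no character set is specified, and it writes charset statements by default so the restore side does not have to guess. Older clients, such as the mysqldump shipped with MySQL 5.7, default to utf8 (utf8mb3) instead, so check the client version before relying on the default.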

mysqldump -u USER -p --single-transaction DATABASE_NAME > dump.sql

If your team prefers commands that make intent obvious in a runbook, be explicit:

mysqldump -u USER -p --single-transaction --default-character-set=utf8mb4 DATABASE_NAME > dump.sql

--single-transaction is useful for InnoDB-backed systems because it dumps from a consistent snapshot without locking every table for the duration of the dump. Tables on non-transactional engines such as MyISAM are not covered by that snapshot. The option does not solve encoding problems by itself, but it does reduce operational risk during a live export. Also, there is usually no need to add --opt; mysqldump enables that option group by default, and it includes --set-charset.
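
Because --set-charset is on by default, a normal dump should name its charset near the top of the file. A small check like the following can confirm that before handoff. It is a sketch: dump_declared_charset is a hypothetical helper, and it assumes the statement appears in the usual SET NAMES form (for example /*!50503 SET NAMES utf8mb4 */;).

```shell
# Hypothetical helper: print the charset named in the first SET NAMES
# statement of a dump file, or nothing if no such statement is found.
dump_declared_charset() {
  grep -m 1 -o -E 'SET NAMES [A-Za-z0-9_]+' "$1" | awk '{ print $3 }'
}
```

Running dump_declared_charset dump.sql should print utf8mb4 for the commands above. Empty output suggests the charset statement was skipped or the file is not a plain mysqldump export.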

When latin1 is still the right answer

Legacy systems sometimes really are latin1. If the metadata confirms that, forcing the dump connection to latin1 can be correct:

mysqldump -u USER -p --single-transaction --default-character-set=latin1 DATABASE_NAME > dump-latin1.sql

Be cautious with --skip-set-charset. That option removes the SET NAMES statement from the output. It can make sense inside a tightly controlled restore pipeline, but it makes the file easier to misuse when another agency, host, or client team receives it. If you deliberately omit the charset statement, restore with the matching client option:

mysql --default-character-set=latin1 DATABASE_NAME < dump-latin1.sql

Avoid the traps that create cleanup work

Do not use utf8 as a casual shortcut. In MySQL, utf8 is a deprecated alias for utf8mb3, which stores at most three bytes per character and so cannot hold emoji or other four-byte characters; it is not full utf8mb4. Do not assume the schema default describes every column. Do not expect a dump to repair data that was already stored with the wrong bytes. And do not ship an export without a test import if the content matters to a launch, billing process, client portal, or CMS migration.

One extra operational note: MySQL documents that output redirection in PowerShell on Windows can create a UTF-16 dump file. UTF-16 is not permitted as a MySQL connection character set, so such a file cannot be loaded correctly. If that environment is involved, use --result-file=dump.sql instead of plain shell redirection.
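
A dump that went through that kind of redirection can often be spotted by its first two bytes. The check below is a sketch that looks for a UTF-16 byte order mark. The name has_utf16_bom is hypothetical, and a UTF-16 file written without a BOM would slip past it.

```shell
# Hypothetical helper: succeed if the file starts with a UTF-16 byte order
# mark (FF FE for little-endian, FE FF for big-endian).
has_utf16_bom() {
  bom=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$bom" = "fffe" ] || [ "$bom" = "feff" ]
}
```

A runbook step could then read: if has_utf16_bom dump.sql; then echo "re-export with --result-file"; fi.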

Make the migration boring on purpose

A useful handover includes the command used, the source charset findings, the MySQL version, and a short note on test records checked after restore. Pick records with accented names, currency symbols, smart quotes, emoji if the app supports them, and non-English text if the business uses it.
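
One way to speed up the record check is to export the chosen test rows from the restored database to a text file and scan it for byte sequences that typically appear when UTF-8 was misread as a single-byte charset. This is a heuristic sketch with an invented function name. Legitimate text containing Ã or Â (Portuguese, for example) will also be counted, so treat hits as records to inspect, not as proof of damage.

```shell
# Hypothetical helper: count lines that contain common mojibake markers.
# Patterns are built from octal escapes so the script itself stays ASCII:
#   \303\203             = "Ã"
#   \303\202             = "Â"
#   \303\242\342\202\254 = "â€" (start of a mangled smart quote or dash)
count_mojibake_lines() {
  LC_ALL=C grep -a -c -F \
    -e "$(printf '\303\203')" \
    -e "$(printf '\303\202')" \
    -e "$(printf '\303\242\342\202\254')" \
    "$1" || true   # grep exits 1 when the count is 0; keep that non-fatal
}
```

A result of 0 on the exported test rows is a reasonable line to include in the handover note, next to the command used and the charset findings.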

The best database exports are verified, repeatable, and documented. For teams under deadline pressure, the temptation is to grab a dump and fix problems later. That is usually more expensive than spending a few minutes confirming metadata and doing one throwaway restore.

If you are planning a MySQL migration, staging refresh, or client handoff, Greg can help shape the export into a controlled delivery plan before the risky part starts.

Sources

  • MySQL 9.7 Reference Manual: mysqldump — A Database Backup Program
  • MySQL 9.7 Reference Manual: Character Sets, Collations, Unicode
  • MySQL 9.7 Reference Manual: mysql Client Options
  • MySQL 9.7 Reference Manual: The INFORMATION_SCHEMA SCHEMATA Table
  • MySQL 9.7 Reference Manual: The INFORMATION_SCHEMA COLUMNS Table