CSV UTF-8 encoding issues in Excel

Oct 9, 2015

This frustrating csv UTF-8 encoding issue, most of the times, occurs after using Microsoft Excel for your CSV file, and that is due to a bug which causes some accented characters to become corrupted before / during / after import or just document edit (depending on when you use Microsoft Excel) – this is a known Microsoft bug which has never been fixed. The solution is to either install and use OpenOffice or Google Docs’ Spreadsheet function for smaller files, load the document and force save as a CSV file with the right encoding.

Step-by-step for OpenOffice

  1. Open your source data in Open Office (Spreadsheet), if it’s in CSV format (text, comma delimited) you will be prompted to select character encoding. By default, it’s set to ‘Western European’. Check the data in that window to see if accented characters look OK, if they do… jump to step 3.
  2. If the characters do look corrupted (unexpected capitals or weird symbols in place of lower case accented characters, etc) you will need to change the character encoding. This will usually just be a change to UTF-8 (not UTF-7) – but check the accented characters!
  3. Click OK and import the data, it should open and look like a normal spreadsheet.
  4. Now proceed to save it in a correctly encoded format by choosing to Save As (Ctrl + Shift + S) and choosing Text CSV (.csv) _from the _formats drop down, just above the save button, and sticking to _“Use Text CSV Format” _in the following prompt box.
  5. Done!

This csv UTF-8 encoding issue seems to be fairly common where data imports / exports are prevalent between SaaS systems from different suppliers.