From the console, the file command can be used to describe a file, including the encoding it is saved in:
user@comp:~/workspace/project$ file dbase/init.sql
dbase/init.sql: UTF-8 Unicode (with BOM) English text
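The "(with BOM)" part of that output refers to the three bytes EF BB BF at the start of the file. As a minimal sketch, the same check can be done by hand with head and od; the helper name has_utf8_bom and the throwaway file names are made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether a file starts with the UTF-8 byte order mark (EF BB BF).
# has_utf8_bom is a hypothetical helper name, not a standard tool.
has_utf8_bom() {
    # Read the first three bytes and print them as hex without offsets or spaces.
    first_bytes=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first_bytes" = "efbbbf" ]
}

# Demo on throwaway files so nothing in a real project is touched.
tmpdir=$(mktemp -d)
printf '\357\273\277SELECT 1;\n' > "$tmpdir/with_bom.sql"
printf 'SELECT 1;\n' > "$tmpdir/without_bom.sql"

for f in "$tmpdir/with_bom.sql" "$tmpdir/without_bom.sql"; do
    if has_utf8_bom "$f"; then
        echo "$(basename "$f"): BOM present"
    else
        echo "$(basename "$f"): no BOM"
    fi
done
```

A check like this can be handy when a tool chokes on the BOM and you want to confirm that it is really there before stripping it.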
Setting a default charset using .htaccess is simple.
However, sometimes the default charset needs to be unset instead (for instance, if AddDefaultCharset is set in the Apache config and this is causing problems).
To do this, edit the .htaccess file and add the line:
AddDefaultCharset Off
To get an HTML page to display UTF-8 encoded text correctly (without setting a default charset), add the following to the head of the HTML page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
If your store uses DIBS to manage payments, you may run into problems with characters such as åäö.
What we had to do was make the form post the data as ISO-8859-1.
We did this by adding accept-charset="iso-8859-1" to the forms that we send to DIBS.
In "/app/design/frontend/default/blank/template/dibs/standard/redirect_paymentwindow.phtml":
<form action="https://payment.architrade.com/paymentweb/start.action" method="post" name="dibs" id="dibs" accept-charset="iso-8859-1">
Once we had done this, the order details were shown correctly on the DIBS page (yes, the Chinese chars are just a test; I have no idea what they might say).
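To see why the form's charset matters, it can help to look at the raw bytes. The sketch below prints the bytes for "å" in both encodings. It only illustrates the general mismatch and assumes nothing about how DIBS decodes the request internally.

```shell
#!/bin/sh
# Sketch: show how the same character is encoded in UTF-8 and in ISO-8859-1.
# The UTF-8 bytes for "å" are written as octal escapes so the script does not
# depend on the encoding it is saved in.
utf8_bytes=$(printf '\303\245' | od -An -tx1 | tr -d ' \n')
latin1_bytes=$(printf '\303\245' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "UTF-8:      $utf8_bytes"    # c3a5
echo "ISO-8859-1: $latin1_bytes"  # e5
```

A receiver that expects ISO-8859-1 but is sent the two UTF-8 bytes will read them as two separate characters ("Ã¥") instead of "å", which is the kind of garbling that accept-charset avoids.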
If you have a file that is saved as ISO-8859-1 (or ISO-LATIN-1, if you prefer to call it that) and wish to convert it to UTF-8, you can use:
iconv --from-code=ISO-8859-1 --to-code=UTF-8 ./oldfile.htm > ./newfile.html
This will create a new file with the converted encoding.
iconv can of course convert to and from many other charsets. To see a list of all the encodings that iconv can work with, use:
iconv -l
If you wish to convert many files at once, find can be used with -exec:
find . -name "*.txt" -exec iconv -f ISO-8859-1 -t UTF-8 {} -o {}.utf8 \;
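One risk with batch conversion is running it on files that are already UTF-8, which double-encodes them. A possible safeguard, sketched below, is to first test whether a file decodes as valid UTF-8 and skip it if so. The helper name convert_tree and the demo files are hypothetical, and the loop assumes file names without newlines.

```shell
#!/bin/sh
# Sketch: convert every .txt file under a directory from ISO-8859-1 to UTF-8,
# skipping files that already decode as valid UTF-8 to avoid double encoding.
# convert_tree is a hypothetical helper name.
convert_tree() {
    find "$1" -type f -name '*.txt' | while IFS= read -r src; do
        # iconv exits non-zero when the input is not valid in the source charset.
        if iconv -f UTF-8 -t UTF-8 "$src" > /dev/null 2>&1; then
            echo "skip (already valid UTF-8): $src"
        else
            iconv -f ISO-8859-1 -t UTF-8 "$src" > "$src.utf8" && echo "converted: $src"
        fi
    done
}

# Demo on throwaway files.
workdir=$(mktemp -d)
printf 'caf\351\n' > "$workdir/latin1.txt"      # "café" in ISO-8859-1
printf 'caf\303\251\n' > "$workdir/utf8.txt"    # "café" in UTF-8
convert_tree "$workdir"
```

Pure ASCII files also count as valid UTF-8, so they are skipped as well, which is harmless because converting them would not change a single byte.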
If you do not wish to change the character encoding for the entire server, but only for one site (or one directory), this can be done using .htaccess:
AddDefaultCharset UTF-8
If you only wish to do this for php or htm files (and not all files):
<FilesMatch "\.(htm|php)$">
AddDefaultCharset UTF-8
</FilesMatch>
If you wish to set the MIME type as well as the encoding (for html files in this example):
AddType 'text/html; charset=UTF-8' html
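After changing any of these directives, it is worth checking which charset the server actually sends in the Content-Type response header. The sketch below pulls the charset out of a block of HTTP headers read from stdin. The helper name charset_from_headers is made up, and the demo uses a canned header block instead of a live server.

```shell
#!/bin/sh
# Sketch: extract the charset parameter from HTTP response headers read on stdin.
# charset_from_headers is a hypothetical helper name.
charset_from_headers() {
    # Drop carriage returns, then print the charset value of the Content-Type line.
    tr -d '\r' | sed -n 's/^[Cc]ontent-[Tt]ype:.*charset=\([^; ]*\).*$/\1/p'
}

# Demo with canned headers. Against a live site the input could come from
# something like: curl -sI http://example.com/   (the URL is a placeholder)
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n' | charset_from_headers
```

If nothing is printed, the response carries no charset parameter, which is what you would expect after AddDefaultCharset Off for content that does not set one itself.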