PHP + ZipArchive + Russian file names = krakozyabry (garbled characters)

Question:

Hello. Please help me solve this problem: there is a zip archive, and it is not known where or how it was created. I need to unpack all of its files into a subfolder, and the file names in the archive contain Russian letters. I tried this:

$zip = new ZipArchive;
$zip->open($uploadfile);
$dir = $uploaddir . basename($uploadfile,".zip");
mkdir($dir);
$zip->extractTo($dir);
$zip->close();
#unlink($uploadfile);
$files = scandir($dir);
foreach ($files as $filename)
{
    print iconv('cp866', 'utf-8', $filename).PHP_EOL;
}

As a result, I see krakozyabry (garbled characters). I then tried simply renaming the files with this command:

find . -type f -exec sh -c 'np=`echo {}|iconv -f cp1252 -t cp850| iconv -f cp866`; mv "{}" "$np"' \;

The result is krakozyabry again. However, if I run this directly in a shell on the server over ssh:

unzip -d '308313---16945050_2016_-_2(31)' '308313---16945050_2016_-_2(31).zip'

then everything unpacks correctly. Moreover, if I run the same command from PHP using system(), I get krakozyabry again. Please help, I cannot work out what the problem is. Just in case, here is some information about the system, obtained over ssh:

$ uname -a
Linux iait-server 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:58:04 UTC 2016 i686 i686 i686 GNU/Linux
$ locale
LANG=ru_RU.UTF-8
LANGUAGE=ru_RU:ru
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=ru_RU.UTF-8

Answer:

Maybe this will help someone like me. At first I tried a lot of different encodings, converting in a single step. But when I checked a file name on some resource, it was detected as cp437, even though PHP reported it as UTF-8. So when I came across the variant below, I already expected it to help.

$name = iconv('UTF-8', 'cp437//IGNORE', $zip->getNameIndex($i));
$name = iconv('cp437', 'cp865//IGNORE', $name);
$name = iconv('cp866','UTF-8//IGNORE',$name);

For details, see the source, where the author himself, the respected DieZeeL, posted the solution: "PHP How to encode filenames with Cyrillic when extracting from a zip archive?". I am reposting it here to increase the chance that people searching will find it, and so that his work is useful to others and not only to me. Perhaps I have written too much, but I wanted to explain that this is not just one more variant, and to acknowledge the author's effort.
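For anyone who wants to apply the renaming while extracting, here is a minimal sketch of the whole loop. It does not reproduce the three-step iconv chain above. Instead it asks ZipArchive for the stored bytes of each name with ZipArchive::FL_ENC_RAW and converts them in a single step, on the assumption that names which are not valid UTF-8 are cp866. The function names entryNameToUtf8 and extractWithUtf8Names are made up for illustration, and the code flattens any sub-directories inside the archive.

```php
<?php
// Hypothetical helper (not from the quoted solution): convert the raw bytes
// of an archive entry name to UTF-8, ASSUMING legacy names are cp866.
function entryNameToUtf8(string $raw): string
{
    // A name that is already valid UTF-8 (including plain ASCII) is kept.
    // Caveat: a cp866 name could, rarely, also happen to be valid UTF-8.
    if (preg_match('//u', $raw) === 1) {
        return $raw;
    }
    $converted = iconv('CP866', 'UTF-8//IGNORE', $raw);
    return $converted === false ? $raw : $converted;
}

// Hypothetical helper: extract every file entry into $destDir under its
// converted name and return the list of paths that were written.
function extractWithUtf8Names(string $zipPath, string $destDir): array
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException("Cannot open archive: $zipPath");
    }
    if (!is_dir($destDir) && !mkdir($destDir, 0777, true)) {
        throw new RuntimeException("Cannot create directory: $destDir");
    }
    $written = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        // FL_ENC_RAW returns the name bytes as stored, without re-encoding.
        $raw = $zip->getNameIndex($i, ZipArchive::FL_ENC_RAW);
        if ($raw === false || substr($raw, -1) === '/') {
            continue; // unreadable entry, or a directory entry
        }
        $name = entryNameToUtf8($raw);
        // Keep only the part after the last '/', so that paths stored in the
        // archive cannot point outside $destDir.
        $pos = strrpos($name, '/');
        $leaf = $pos === false ? $name : substr($name, $pos + 1);
        if ($leaf === '' || $leaf === '.' || $leaf === '..') {
            continue;
        }
        $data = $zip->getFromIndex($i);
        if ($data === false) {
            continue; // entry could not be read
        }
        $target = $destDir . '/' . $leaf;
        file_put_contents($target, $data);
        $written[] = $target;
    }
    $zip->close();
    return $written;
}
```

With the variables from the question, the call would look like extractWithUtf8Names($uploadfile, $dir). If the archive turns out to use a different legacy code page, only the 'CP866' argument in entryNameToUtf8 needs to change.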
