FileForums

Thread URL: https://fileforums.com/showthread.php?t=99980

elit 08-01-2018 14:04

LZ4 MOD for *certain* game's data archives | repacks
 
5 Attachment(s)
This is a modified LZ4 compressor for repacking game files before injecting them back into their original (or zeroed) data archives. It will very likely only work on a small selection of games: specifically Raiders of the Broken Planet - Wardog Fury, for which it was tested, and probably other Mercury Steam games that use LZ4 data archives (not all of them use lz4), such as the other Raiders games and perhaps 2 Castlevania games as well. It compresses files originally extracted (and unpacked) with castlevania.bms using quickbms back into their exact, crc-perfect compressed dump. For re-injecting, the script needs to be modified to raw dump using the log command (everything is included below). Data packs will/must be crc-exact to the original. Compressed files consist of only compressed blocks, each preceded by its block size and crc, without the LZ4 header and tail. Frame crc must also be disabled. The command line is therefore "lz4 -9 -B5 --no-frame-crc FILE". -9 and -B5 were for this game. Do not use -B262144, as for some reason it does not give the same result. Read further for more details. It may need to be re-modded for a different game/version.

The story:
LZ4 is a whore. I mean it, I almost lost my mind. It took me weeks to figure everything out. But I had to do it; this game was the perfect opportunity for me to learn, to get closer to that "FitGirl" status. The bms script was simple, but at the same time the challenges that arose made me learn so much more... I am glad.

First I had to figure out the parameters. (Actually, first I had to learn basic bms scripting, but I will skip that.) This wasn't so hard, as from the bms script I already knew the game compresses with 256k blocks (0x040000). The LZ4 specification states that -B5 == 256 KB blocks. I compressed a random extracted file with every option and compared it to the original raw (not decompressed) file dump.
(To extract a raw dump, you just comment out the "comtype" lines as well as the chunk extract function and use the classic log command, or use the same modded bms file as for reinjecting (see the attached files). This method also makes quickbms pretty much the same as Razor's Injector Maker, but quicker and with no file size/memory limit.)
I found that compression level -9 came closest, matching the most bytes between the files, so it was clear this was the correct setting. But the files were still not the same (even without taking head/tail and the block crc position into account).
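The comparison described above can be sketched roughly like this. It is a minimal Python sketch, not the tooling actually used in this post: `match_ratio` and `rank_candidates` are hypothetical helpers that score how many byte positions agree between each candidate output and the raw dump, so that compression levels can be ranked.

```python
def match_ratio(candidate: bytes, original: bytes) -> float:
    """Fraction of byte positions at which the two buffers agree.

    Positions past the end of the shorter buffer count as mismatches.
    """
    longest = max(len(candidate), len(original))
    if longest == 0:
        return 1.0  # two empty buffers are trivially identical
    same = sum(a == b for a, b in zip(candidate, original))
    return same / longest


def rank_candidates(candidates: dict[str, bytes], original: bytes) -> list[tuple[str, float]]:
    """Sort candidate outputs (label -> compressed bytes) from best to worst match."""
    scores = [(label, match_ratio(data, original)) for label, data in candidates.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

The score is purely positional, so once two outputs drift apart in length everything after that point counts as a mismatch. It is only a rough guide for picking the closest level, not proof of a match.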

I ended up downloading *every single* LZ4 version from github. I learned the hard way that every few versions the compression changed (different output); this was hell. I even found that each compression level is better or worse for different files: sometimes the highest one gave worse results, sometimes level 10 gave worse results than level 9 but level 11 was better again, then on the next file the scenario switched, etc. Then version differences in compression on top of that. I found that the files had to be crc-exact because otherwise the game crashed, probably because it keeps the frame crc somewhere else in the archive, but funnily it crashed if the 2 smaller game packs were modded but the other 2 biggest ones were not. And so on, my god...

Finally I found that v128-v131 were an exact match (after removing header/tail and... oh yes, block crc, I will explain later). Yet out of, say, 30000+ files, all were exact except ~300 that were still wrong. This was killing me; eventually I found that there is a certain "switch" in the compression function that I had to force one way only. Then it compressed everything correctly, but some files made the app crash. After another headache I made a *very* dirty workaround that, even to my own surprise, worked, and I got 90000+ files all crc-perfect with no crash. I may still redo it though.

So, long story short: with LZ4 you not only need to match/guess the correct parameters, you also need the right version AND you may still need to make certain changes in the code to match the same compression behavior. With that said, it should now be easy for everyone to figure out and re-mod for other games as needed once you compare the src diffs for changes, as they are very small.
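The search over versions and parameters can be sketched as a small harness. This is only an illustration under assumptions: `find_exact_matches` is a hypothetical helper, and each entry in `candidates` is assumed to be a callable wrapping one lz4 build/level combination that returns data in the same layout as the raw dump (header/tail already stripped, for example).

```python
import zlib
from typing import Callable, Dict, List

# One candidate = one (lz4 build, parameters) combination, wrapped as a function.
Compressor = Callable[[bytes], bytes]


def find_exact_matches(plain: bytes, original_dump: bytes,
                       candidates: Dict[str, Compressor]) -> List[str]:
    """Return the labels of all candidates whose output is crc-identical to the dump."""
    target_crc = zlib.crc32(original_dump)
    matches = []
    for label, compress in candidates.items():
        produced = compress(plain)
        # Compare length first, then crc32, as a cheap identity check.
        if len(produced) == len(original_dump) and zlib.crc32(produced) == target_crc:
            matches.append(label)
    return matches
```

Running this over a handful of extracted files rather than just one helps, since as described above a setting can match most files and still fail on a few.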

Oh, and that block crc... the game archive files use a "block size -> block crc -> cmp data" block pattern (you learn that from the bms script), while the lz4 specification puts the crc after the end of the block, not at the beginning! So that had to be taken care of as well. Now the whole dump is a match, including the block crc, btw.
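To illustrate the layout difference, here is a rough Python sketch that walks a standard LZ4 frame and re-emits its blocks in "size -> crc -> data" order, dropping the frame header and tail. It is not the attached mod (which changes the C source directly). It assumes the frame was written with per-block checksums enabled (FLG bit 4) and that the game's 4-byte crc is the same value the frame stores, which this thread does not confirm. `reorder_frame_blocks` is a hypothetical name.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204


def reorder_frame_blocks(frame: bytes) -> bytes:
    """Turn 'size, data, checksum' frame blocks into 'size, checksum, data'."""
    magic, flg = struct.unpack_from("<IB", frame, 0)
    if magic != LZ4_FRAME_MAGIC:
        raise ValueError("not an LZ4 frame")
    if not flg & 0x10:
        raise ValueError("frame has no per-block checksums")
    pos = 6                # magic (4) + FLG (1) + BD (1)
    if flg & 0x08:
        pos += 8           # optional content size field
    if flg & 0x01:
        pos += 4           # optional dictionary ID field
    pos += 1               # header checksum byte (not verified here)

    out = bytearray()
    while True:
        (raw_size,) = struct.unpack_from("<I", frame, pos)
        pos += 4
        if raw_size == 0:  # EndMark: no more blocks, the tail is dropped
            break
        length = raw_size & 0x7FFFFFFF  # top bit only flags an uncompressed block
        data = frame[pos:pos + length]
        pos += length
        checksum = frame[pos:pos + 4]
        pos += 4
        out += struct.pack("<I", raw_size) + checksum + data
    return bytes(out)
```

Whether the size field should keep the "uncompressed" flag bit, and which checksum the game really expects, are things to confirm against the bms script for the game in question.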

Ok done with story:
I include both the original and modded bms script as well as both the original and modded lz4 src, so you can learn. There is also a compiled modded binary. This was tested for Raiders of the Broken Planet - Wardog Fury, but could/should work for the whole series as well as the Castlevanias from Mercury Steam. Other games I don't know; it depends on how their files' lz4 blocks are structured, how trigger-happy they are with crc's and so on. You can learn that from their bms scripts. If the files follow the same pattern as above, then it's only a matter of matching compression options and/or lz4 version. For this game you use the "-9 -B5 --no-frame-crc" parameters and nothing else. The compressor was not tested with other parameters (but should work)!

Hopefully this will help encourage people to get into advanced repacking. Join and become a *Russian Hacker* NOW :cool:.

Attachment 20654

Attachment 20655

Attachment 20656

Attachment 20657

NEW:
Attachment 20659

elit 08-01-2018 19:23

Mod2 upped: small code cleanup and final changes. It should be final for these games.

The "-9, --no-frame-crc and -B5" flags are now the defaults, and this is also shown in the help options. No need to invoke with explicit params anymore (unless you need different ones), just lz4.exe FILE.

Razor12911 08-01-2018 19:46

Quote:

Originally Posted by elit (Post 465701)
LZ4 is a whore.

I couldn't have said it better myself :)

elit 09-01-2018 05:50

Quote:

Originally Posted by Razor12911 (Post 465708)
I couldn't have said it better myself :)

I am curious, do lzo, zlib and others (zstd, ...) have similar problems (different output per version), or are they more stable in that regard?

Razor12911 09-01-2018 20:21

zstd has this problem as well. lzo, not so sure but it shouldn't have this problem since all variants of lzo come with binaries/libraries, lzo1x, lzo2a and so on.
zlib doesn't have this.

Gupta 09-01-2018 23:31

you can build zstd with legacy support. Then it can decompress any version

elit 10-01-2018 04:14

Quote:

Originally Posted by PrinceGupta2000 (Post 465736)
you can build zstd with legacy support. Then it can decompress any version

I am more interested in compression variability though, you need to match crc in most game archives.

78372 10-01-2018 04:40

Probably zlib is the one which has almost no problems :p

Gupta 10-01-2018 05:08

Code:

/*! ZSTD_isFrame() :
 *  Tells if the content of `buffer` starts with a valid Frame Identifier.
 *  Note : Frame Identifier is 4 bytes. If `size < 4`, @return will always be 0.
 *  Note 2 : Legacy Frame Identifiers are considered valid only if Legacy Support is enabled.
 *  Note 3 : Skippable Frame Identifiers are considered valid. */
ZSTDLIB_API unsigned ZSTD_isFrame(const void* buffer, size_t size);


Razor12911 10-01-2018 05:28

PrinceGupta, there is a reason ztool comes with external libraries: it's so the user can swap them until he finds the one that was used in a particular game. Decompression is never a problem, but recompression is. When it comes to lz4 and zstd, everything must be precise, else you will get a crc mismatch every time. ztool is perhaps a bit more forgiving, since it comes with internal diff patching functions for when the user supplies a dll that was not the one used for compression in the first place.

elit 10-01-2018 12:00

Quote:

Originally Posted by Razor12911 (Post 465735)
zstd has this problem as well. lzo, not so sure but it shouldn't have this problem since all variants of lzo come with binaries/libraries, lzo1x, lzo2a and so on.
zlib doesn't have this.

Now I see the pattern: both lz4 and zstd are developed by the same author. Well, at least his code is nicely organized and easy to learn.

Quote:

Originally Posted by Razor12911 (Post 465746)
PrinceGupta there is a reason ztool comes with external libraries, it’s for user to change them until he finds one that was used in a particular game.

OMG I kill you man, that's great! How come I did not know.. Time to do my games all over again :o.
EDIT: Well actually, thinking about it, it may not be enough with lz4/zstd. I don't know how you coded it, but in this case for example, the default library would read the block structure wrong even if the compression matches, because of the mentioned crc position, which is part of the block, unless your tool skips it and can focus only on the game data.
Still, it gave me the idea that perhaps one doesn't need to repack whole game archives with quickbms anymore, because depending on how your ztool works, maybe all I need is to modify the library as I did here and ztool could function again. That would be great and would make ztool universal; then all you do is compile all lz4 versions and recycle them forever. But I would like to know in more detail how exactly your tool works/reads data, though it's probably your secret so I am not going to ask.

Razor12911 11-01-2018 19:14

Quote:

Originally Posted by elit (Post 465761)
OMG I kill you man, that's great! How come I did not know.. Time to do my games all over again :o.
EDIT: Well actually, thinking about it, it may not be enough with lz4/zstd. I don't know how you coded it, but in this case for example, the default library would read the block structure wrong even if the compression matches, because of the mentioned crc position, which is part of the block, unless your tool skips it and can focus only on the game data.
Still, it gave me the idea that perhaps one doesn't need to repack whole game archives with quickbms anymore, because depending on how your ztool works, maybe all I need is to modify the library as I did here and ztool could function again. That would be great and would make ztool universal; then all you do is compile all lz4 versions and recycle them forever. But I would like to know in more detail how exactly your tool works/reads data, though it's probably your secret so I am not going to ask.

Yes, the tool focuses on game data. Error codes are returned all the time; like you said, the library reads the block structure incorrectly, and it happens a lot, but it still returns decompressed data.
I can tell you how it works if you want, it's no secret.

elit 12-01-2018 12:34

Quote:

Originally Posted by Razor12911 (Post 465810)
...I can tell you how it works if you want, it's no secret.

That would be helpful, you can PM me or post it here whatever you prefer.

I actually tried to use the lib from this version, which I know works with the game, and could not get ztool to unpack anything with it (output same size). But I am not sure I did it right: your lz4.dll is ~60kb while my compiled one was ~200+kb, and I made the .dll by manually changing the cmake build script, adding the "SHARED" flag to the "add_library" call, something like this:
Code:

# before
add_library(liblz4 ${LZ4_SRCS_LIB})
# after
add_library(liblz4 SHARED ${LZ4_SRCS_LIB})
Now the thing is, if I don't do that and just tick the option to build the lib in cmake gui, it does output a library of around the same size as yours (~60kb), but it's a .lib, not a windows dll. When I tried the big dll, ztool did not complain with errors but the output was the same size; if I rename the lib to dll it outputs an error (obviously not the same thing). Maybe someone can give me a hint about what I did wrong; you can also try yourself if you can compile it more properly.

Additionally, I found on the zenhax forums that zstd may not be so bad after all; it does keep track of the version in its header:
"zstd
It starts with a little endian 32bit magic number, when seen with a hex editor only the first byte (the low 8bit) is different because it depends by the version of the algorithm:
Code:

1e b5 2f fd    v0.1
22 b5 2f fd    v0.2
23 b5 2f fd    v0.3
24 b5 2f fd    v0.4
25 b5 2f fd    v0.5
26 b5 2f fd    v0.6
27 b5 2f fd    v0.7
28 b5 2f fd    v0.8, current version"

Unfortunately lz4 doesn't do this, but at least it's the only one. Of course, there still has to be a header present in the game resource to know.
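The table quoted above translates directly into a lookup. A minimal Python sketch, assuming the mapping in the quoted zenhax post is accurate; `zstd_version_from_magic` is a hypothetical helper name, and as far as I know later zstd releases kept the v0.8 magic, so "v0.8" here really means "v0.8 or newer".

```python
from typing import Optional

# First (lowest) byte of the little-endian magic number, per the quoted table.
ZSTD_MAGIC_FIRST_BYTE = {
    0x1E: "v0.1",
    0x22: "v0.2",
    0x23: "v0.3",
    0x24: "v0.4",
    0x25: "v0.5",
    0x26: "v0.6",
    0x27: "v0.7",
    0x28: "v0.8",
}


def zstd_version_from_magic(header: bytes) -> Optional[str]:
    """Guess the zstd format version from the first 4 bytes, or None if no match."""
    if len(header) < 4 or header[1:4] != b"\xb5\x2f\xfd":
        return None
    return ZSTD_MAGIC_FIRST_BYTE.get(header[0])
```

This only works when the game kept the frame header; a stripped stream gives no such hint.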

Razor12911 12-01-2018 20:51

Well, first of all, do you know that ztool's plz4 and pzstd weren't designed to be universal?

elit 13-01-2018 07:48

Quote:

Originally Posted by Razor12911 (Post 465835)
Well, first of all, do you know that ztool's plz4 and pzstd weren't designed to be universal?

Yeah, I guess you hardcoded it in a specific way, but feel free to elaborate in more detail if you can; any additional info is more than welcome. I also know that some game data packs use various kinds of obfuscation and encryption. But since you mentioned that the external libs were made that way intentionally, so that the user can replace them with a different version for different games, I assumed it was designed to be more universal, at least for basic archive formats.

Aside from that, I just realized why even my modified version may not work with ztool. I modified the compression function but not the decompression, which likely still reads the blocks incorrectly. If ztool uses it to verify data, no wonder it won't work. Of course, as I said, any additional info on this issue that may help me understand your tool better is more than welcome.

panker1992 29-05-2020 11:36

Sorry to revive this buried topic from the afterlife.

I briefly have some questions, as I am trying to get into lz4 for the "I don't know, I lost count" time.

You give some tools above, a story and an explanation.

1) How do I use the script to import the files back in? The files are decompressed and they will give an error.
Q) Where does it say how to import them and with what parameters?

2) How do you zero out the entire space of the files you want to remove, for example sound?
Q) Castlevania is mixed with sounds; they are wav/ogg files and take up a lot of space. I only want to zero out those and then reimport them, but your script gives an error.

elit 05-07-2020 13:46

1 Attachment(s)
Panker, I simply wrote a bat script:
Attachment 27348


As for nulling, either with quickbms -Z or with the rawinjector I made. You will have to figure out the rest yourself; it was a long time ago.

panker1992 08-07-2020 12:54

Thanks for giving guidance !

I have posted several youtube videos on very advanced compressing and analyzing data :D

sorry to bother this old grave of a post, let it rest in peace :D

L33THAK0R 03-08-2021 20:34

Ah fuck, I didn't realise LZ4 was such a bitch. Also, what Razor said about Z/XTool LZ4 support not being designed to be universal explains a lot of my troubles with the codec. I've been trying to pack "Demon's Souls" (contains LZ4 streams) and was wondering why Z/XTool wouldn't work with it; I guess this is the reason why.

panker1992 04-08-2021 11:20

Last time I checked, Demon's Souls isn't out for PC yet!!!

Are you telling me you have a secret that it may come out? :D

KaktoR 04-08-2021 11:53

@panker
https://fileforums.com/showpost.php?...&postcount=396

panker1992 04-08-2021 13:46

if it's possible i want sample data of lz4 streams

i think it's possible

L33THAK0R 04-08-2021 17:37

Quote:

Originally Posted by panker1992 (Post 493568)
Last time I checked, Demon's Souls isn't out for PC yet!!!

Are you telling me you have a secret that it may come out? :D

@panker1992, sorry I should have clarified. I'm currently repacking console exclusives, as I've figured out how to pack some trickier microsoft/sony container-types, and have been getting good ratios. Although Xenia (Xbox 360 emulator) progress is slow, RPCS3 (PS3 emulator) development has had a decent uptick in progress. "Demon's Souls" for the PS3 (not PS5) contains LZ4 streams, however this is based entirely off of investigating the methods used in 2 repacks of the title (A Darack team-member, & Gnarly), the former of which is currently offline due to personal issues, and the latter being unable to recall their methodology. I currently can't say which files (of which there are ~6000) contain the LZ4 streams, as the GFS tool is unable to detect any.

I'm also not sure how to identify whether LZ4 compression was used, the way the author's other project, ZSTD, can be identified by examining the first 4 bytes in a hex editor.

Masquerade 05-08-2021 08:30

LZ4 can be used without header info. This also confused me at first. I'm out of country currently, but I definitely intend to do some more research when I get back.

Btw you could encounter problems analysing the data, since desktop PCs are little endian whereas the PS3's Cell CPU is PowerPC-based, which is big endian iirc.
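A small Python illustration of why byte order matters when reading size fields out of console data. `read_u32_both_ways` is a hypothetical helper, and the sample bytes are made up for the example, not taken from any game file.

```python
import struct


def read_u32_both_ways(raw: bytes, offset: int = 0) -> tuple[int, int]:
    """Interpret the same 4 bytes as little-endian and as big-endian."""
    little = struct.unpack_from("<I", raw, offset)[0]
    big = struct.unpack_from(">I", raw, offset)[0]
    return little, big
```

For the bytes `00 04 00 00` this gives 0x400 as little endian but 0x40000 as big endian, so a block size field read with the wrong byte order looks like nonsense.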

L33THAK0R 05-08-2021 10:35

Quote:

Originally Posted by Masquerade (Post 493574)
LZ4 can be used without header info.

This is both fascinating and slightly depressing, just when I think I'm starting to understand the basics of these algorithms, I realise I'm yet to even scratch the surface.

Quote:

Originally Posted by Masquerade (Post 493574)
Btw you could encounter problems analysing the data, since desktop PCs are little endian whereas the PS3's Cell CPU is PowerPC-based, which is big endian iirc.

Damn, I hadn't considered the architecture differences. I suppose, though, that it shouldn't be too hard to determine through trial and error whether a mainstream compression algorithm is being used, should a universal methodology for LZ4 pre-compression be developed (which from the sounds of things would be no easy feat).

Masquerade 05-08-2021 12:49

Quote:

Originally Posted by L33THAK0R (Post 493575)
I realise I'm yet to even scratch the surface.

Welcome to the party.

FitGirl 06-08-2021 17:35

LZ4 was only released in 2011, so it could not exist in Demon's Souls. If any LZ algo is used in there, it's something custom, like LZSS; Japanese devs love it.

L33THAK0R 06-08-2021 19:18

Quote:

Originally Posted by FitGirl (Post 493587)
LZ4 was only released in 2011, so it could not exist in Demon's Souls. If any LZ algo is used in there, it's something custom, like LZSS; Japanese devs love it.

Rats! I hadn't considered that. That raises the question of how ztool was used in the repacks I mentioned to pack these streams. It'll definitely be interesting to see if anyone's able to figure this one out, but given how niche this hobby is, and more niche still with console game-data compression, I think this might just remain a mystery for a while longer!

Masquerade 07-08-2021 00:01

Quote:

Originally Posted by L33THAK0R (Post 493588)
Rats! I hadn't considered that. That raises the question of how ztool was used in the repacks I mentioned to pack these streams. It'll definitely be interesting to see if anyone's able to figure this one out, but given how niche this hobby is, and more niche still with console game-data compression, I think this might just remain a mystery for a while longer!

Well if you can see what plugin was used for ZTool, that might help you.

L33THAK0R 07-08-2021 01:50

Quote:

Originally Posted by Masquerade (Post 493589)
Well if you can see what plugin was used for ZTool, that might help you.

All I was able to gather from the installer and from testing the archives was that, at least for Gnarly's repack of the title, "plz4" was used. I haven't been able to look at the Darack team's repack, as no mirrors of it currently exist, but I theorise a similar method was used.

FitGirl 07-08-2021 15:53

Quote:

Originally Posted by L33THAK0R (Post 493588)
Rats! I hadn't considered that. That raises the question of how ztool was used in the repacks I mentioned to pack these streams. It'll definitely be interesting to see if anyone's able to figure this one out, but given how niche this hobby is, and more niche still with console game-data compression, I think this might just remain a mystery for a while longer!

It means those streams were wrongly picked up by the detector, and it generated some delta for restoring non-existing streams. I bet packing the game w/o LZ4 precompression will give a better result than with it.

Headerless LZ4s are very hard to detect, so many false streams are found. They even decompress, but are not real streams :)
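To see why false streams turn up, here is a simplified pure-Python decoder for a raw (headerless) LZ4 block. It is a sketch for illustration only, not liblz4 and not what ztool uses; it follows the documented token/literals/offset/match layout but skips some end-of-block restrictions that real decoders may enforce. `lz4_block_decode` is a hypothetical name.

```python
def lz4_block_decode(src: bytes) -> bytes:
    """Decode one raw LZ4 block; raise ValueError if the input is malformed."""
    if not src:
        raise ValueError("empty input")
    out = bytearray()
    pos, end = 0, len(src)

    def read_extra_length(pos: int) -> tuple[int, int]:
        # Length extension: keep adding bytes until one is below 255.
        total = 0
        while True:
            if pos >= end:
                raise ValueError("truncated length")
            byte = src[pos]
            pos += 1
            total += byte
            if byte != 255:
                return total, pos

    while pos < end:
        token = src[pos]
        pos += 1
        literal_len = token >> 4
        if literal_len == 15:
            extra, pos = read_extra_length(pos)
            literal_len += extra
        if pos + literal_len > end:
            raise ValueError("literals run past end of input")
        out += src[pos:pos + literal_len]
        pos += literal_len
        if pos == end:
            return bytes(out)  # the last sequence carries literals only
        if pos + 2 > end:
            raise ValueError("truncated match offset")
        offset = src[pos] | (src[pos + 1] << 8)
        pos += 2
        match_len = token & 0x0F
        if match_len == 15:
            extra, pos = read_extra_length(pos)
            match_len += extra
        match_len += 4  # minimum match length is 4
        if offset == 0 or offset > len(out):
            raise ValueError("match offset points outside the output")
        for _ in range(match_len):
            out.append(out[-offset])  # byte-wise copy handles overlapping matches
    raise ValueError("block ended on a match instead of literals")
```

In this simplified decoder, `lz4_block_decode(b"0abc")` returns `b"abc"`: four plain ASCII characters happen to parse as a valid block, because there is no magic number or checksum to rule them out. A detector scanning arbitrary data hits this kind of accidental match constantly.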

Razor would tell you a whole book of such stories.

L33THAK0R 07-08-2021 20:00

Quote:

Originally Posted by FitGirl (Post 493595)
It means those streams were wrongly picked up by the detector, and it generated some delta for restoring non-existing streams.

Alright, I think I understand the gist of this. I didn't realise this could happen; it's quite interesting, honestly.

Quote:

Originally Posted by FitGirl (Post 493595)
I bet packing the game w/o LZ4 precompression will give a better result than with it.

I hope I'll be able to figure out how to squash the main data-files down, since it seems the title can be compressed a considerable amount.

Quote:

Originally Posted by FitGirl (Post 493595)
Headerless LZ4s are very hard to detect, so many false streams are found. They even decompress, but are not real streams :)

Thankfully I'm yet to encounter many titles that use this algorithm, fingers crossed one day I'll be equipped to handle such a monster!


Quote:

Originally Posted by FitGirl (Post 493595)
Razor would tell you a whole book of such stories.

If he ever comes back, I'd honestly love to hear more about this


All times are GMT -7.
